Overview

Dataset statistics

Number of variables: 40
Number of observations: 470116
Missing cells: 4946693
Missing cells (%): 26.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 163.2 MiB
Average record size in memory: 364.0 B
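
The derived figures in this block can be cross-checked from the raw counts. The sketch below assumes that "Missing cells (%)" is missing cells divided by variables times observations, and that 1 MiB is 1024**2 bytes; the variable names are illustrative and are not part of the report.

```python
# Raw figures copied from the dataset statistics block.
n_variables = 40
n_observations = 470_116
missing_cells = 4_946_693
total_memory_mib = 163.2  # already rounded in the report

# Share of missing cells over all cells (variables x observations).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average bytes per record, assuming 1 MiB = 1024**2 bytes.
avg_record_bytes = total_memory_mib * 1024**2 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Because the memory total is itself rounded to one decimal place, the recomputed record size can only be expected to agree to roughly that precision.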

Variable types

Numeric: 16
Categorical: 24

Alerts

FPA_ID has a high cardinality: 470116 distinct values [High cardinality]
NWCG_REPORTING_UNIT_ID has a high cardinality: 1347 distinct values [High cardinality]
NWCG_REPORTING_UNIT_NAME has a high cardinality: 1343 distinct values [High cardinality]
SOURCE_REPORTING_UNIT has a high cardinality: 4077 distinct values [High cardinality]
SOURCE_REPORTING_UNIT_NAME has a high cardinality: 3566 distinct values [High cardinality]
LOCAL_FIRE_REPORT_ID has a high cardinality: 5048 distinct values [High cardinality]
LOCAL_INCIDENT_ID has a high cardinality: 165302 distinct values [High cardinality]
FIRE_CODE has a high cardinality: 50725 distinct values [High cardinality]
FIRE_NAME has a high cardinality: 151885 distinct values [High cardinality]
ICS_209_INCIDENT_NUMBER has a high cardinality: 6070 distinct values [High cardinality]
ICS_209_NAME has a high cardinality: 5723 distinct values [High cardinality]
MTBS_ID has a high cardinality: 2733 distinct values [High cardinality]
MTBS_FIRE_NAME has a high cardinality: 2362 distinct values [High cardinality]
COMPLEX_NAME has a high cardinality: 690 distinct values [High cardinality]
STATE has a high cardinality: 52 distinct values [High cardinality]
COUNTY has a high cardinality: 3006 distinct values [High cardinality]
FIPS_NAME has a high cardinality: 1637 distinct values [High cardinality]
Shape has a high cardinality: 430309 distinct values [High cardinality]
Unnamed: 0 is highly overall correlated with OBJECTID and 13 other fields [High correlation]
OBJECTID is highly overall correlated with Unnamed: 0 and 13 other fields [High correlation]
FOD_ID is highly overall correlated with Unnamed: 0 and 6 other fields [High correlation]
FIRE_YEAR is highly overall correlated with Unnamed: 0 and 5 other fields [High correlation]
DISCOVERY_DATE is highly overall correlated with Unnamed: 0 and 5 other fields [High correlation]
DISCOVERY_DOY is highly overall correlated with CONT_DOY and 2 other fields [High correlation]
DISCOVERY_TIME is highly overall correlated with CONT_TIME [High correlation]
STAT_CAUSE_CODE is highly overall correlated with Unnamed: 0 and 10 other fields [High correlation]
CONT_DATE is highly overall correlated with Unnamed: 0 and 6 other fields [High correlation]
CONT_DOY is highly overall correlated with DISCOVERY_DOY and 3 other fields [High correlation]
CONT_TIME is highly overall correlated with DISCOVERY_TIME [High correlation]
LATITUDE is highly overall correlated with Unnamed: 0 and 7 other fields [High correlation]
LONGITUDE is highly overall correlated with Unnamed: 0 and 11 other fields [High correlation]
OWNER_CODE is highly overall correlated with Unnamed: 0 and 8 other fields [High correlation]
SOURCE_SYSTEM_TYPE is highly overall correlated with Unnamed: 0 and 10 other fields [High correlation]
SOURCE_SYSTEM is highly overall correlated with Unnamed: 0 and 14 other fields [High correlation]
NWCG_REPORTING_AGENCY is highly overall correlated with Unnamed: 0 and 10 other fields [High correlation]
STAT_CAUSE_DESCR is highly overall correlated with SOURCE_SYSTEM_TYPE and 4 other fields [High correlation]
OWNER_DESCR is highly overall correlated with Unnamed: 0 and 7 other fields [High correlation]
STATE is highly overall correlated with Unnamed: 0 and 15 other fields [High correlation]
FIPS_CODE is highly overall correlated with STATE [High correlation]
LOCAL_FIRE_REPORT_ID has 364964 (77.6%) missing values [Missing]
LOCAL_INCIDENT_ID has 205164 (43.6%) missing values [Missing]
FIRE_CODE has 388867 (82.7%) missing values [Missing]
FIRE_NAME has 240071 (51.1%) missing values [Missing]
ICS_209_INCIDENT_NUMBER has 463627 (98.6%) missing values [Missing]
ICS_209_NAME has 463627 (98.6%) missing values [Missing]
MTBS_ID has 467334 (99.4%) missing values [Missing]
MTBS_FIRE_NAME has 467334 (99.4%) missing values [Missing]
COMPLEX_NAME has 468800 (99.7%) missing values [Missing]
DISCOVERY_TIME has 220721 (47.0%) missing values [Missing]
CONT_DATE has 222921 (47.4%) missing values [Missing]
CONT_DOY has 222921 (47.4%) missing values [Missing]
CONT_TIME has 243096 (51.7%) missing values [Missing]
COUNTY has 169082 (36.0%) missing values [Missing]
FIPS_CODE has 169082 (36.0%) missing values [Missing]
FIPS_NAME has 169082 (36.0%) missing values [Missing]
FIRE_SIZE is highly skewed (γ1 = 114.0371381) [Skewed]
FPA_ID is uniformly distributed [Uniform]
ICS_209_INCIDENT_NUMBER is uniformly distributed [Uniform]
ICS_209_NAME is uniformly distributed [Uniform]
MTBS_ID is uniformly distributed [Uniform]
Shape is uniformly distributed [Uniform]
Unnamed: 0 has unique values [Unique]
OBJECTID has unique values [Unique]
FOD_ID has unique values [Unique]
FPA_ID has unique values [Unique]
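
To give the skewness alert (γ1) some intuition, here is a minimal sketch of a moment-based skewness coefficient. It assumes the plain definition g1 = m3 / m2^1.5 with population moments; the estimator used by the report may apply a bias correction, so values for small samples can differ. The function name and the toy data are illustrative only and are not drawn from the dataset.

```python
def skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Toy data: a symmetric sample and one with a single very large value.
symmetric = [1, 2, 3, 4, 5]
long_tail = [1, 1, 1, 1, 100]

print(skewness(symmetric))  # zero for a symmetric sample
print(skewness(long_tail))  # positive: the tail points to the right
```

A large positive coefficient, as reported for FIRE_SIZE, means that a small number of very large values dominates the right tail of the distribution.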

Reproduction

Analysis started: 2022-12-07 17:42:42.164313
Analysis finished: 2022-12-07 17:46:28.421414
Duration: 3 minutes and 46.26 seconds
Software version: pandas-profiling v3.5.0
Download configuration: config.json
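
A report of this kind is typically produced with a few lines of pandas-profiling. The sketch below is an assumption about how it could be regenerated, not the original command: the function name, both file paths and the title are hypothetical, and the third-party imports sit inside the function so that defining it does not require the packages to be installed.

```python
def build_report(csv_path: str, html_path: str) -> None:
    """Profile a CSV file and write the HTML report (hypothetical helper)."""
    # Third-party imports kept local; pandas-profiling 3.x is assumed.
    import pandas as pd
    from pandas_profiling import ProfileReport

    # A plain read_csv keeps a saved index as an "Unnamed: 0" column,
    # which matches the first variable listed in this report.
    df = pd.read_csv(csv_path)

    report = ProfileReport(df, title="Wildfire records profile")  # title is illustrative
    report.to_file(html_path)
```

A call such as build_report("fires.csv", "report.html") would then write the HTML file; passing index_col=0 to read_csv would instead drop the "Unnamed: 0" column from the profile.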

Variables

Unnamed: 0
Real number (ℝ)

HIGH CORRELATION
UNIQUE

Distinct: 470116
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 941263.02
Minimum: 8
Maximum: 1880464
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.2 MiB

Quantile statistics

Minimum: 8
5-th percentile: 93986.75
Q1: 472125.25
median: 941789
Q3: 1411155.5
95-th percentile: 1786484.2
Maximum: 1880464
Range: 1880456
Interquartile range (IQR): 939030.25

Descriptive statistics

Standard deviation: 542587.69
Coefficient of variation (CV): 0.57644641
Kurtosis: -1.1976586
Mean: 941263.02
Median Absolute Deviation (MAD): 469524.5
Skewness: -0.0039471539
Sum: 4.425028 × 10^11
Variance: 2.944014 × 10^11
Monotonicity: Not monotonic
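
Several of these statistics are functions of the others, which gives a quick consistency check. The sketch below uses the values reported for Unnamed: 0 and assumes the usual definitions (CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 - Q1, range = maximum - minimum); the variable names are illustrative.

```python
# Reported statistics for the "Unnamed: 0" column.
mean = 941263.02
std = 542587.69
q1, q3 = 472125.25, 1411155.5
minimum, maximum = 8, 1880464

cv = std / mean                  # coefficient of variation
variance = std ** 2              # square of the standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # spread between the extremes

print(f"CV:       {cv:.8f}")
print(f"Variance: {variance:.6e}")
print(f"IQR:      {iqr}")
print(f"Range:    {value_range}")
```

Small differences in the trailing digits are to be expected, since the inputs are the rounded values displayed in the report.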
Histogram with fixed size bins (bins=50)
Common values

Value                    Count     Frequency (%)
894421                   1         < 0.1%
653827                   1         < 0.1%
724486                   1         < 0.1%
576392                   1         < 0.1%
1845934                  1         < 0.1%
425989                   1         < 0.1%
726987                   1         < 0.1%
544050                   1         < 0.1%
1423094                  1         < 0.1%
1606069                  1         < 0.1%
Other values (470106)    470106    > 99.9%

Minimum 10 values

Value      Count    Frequency (%)
8          1        < 0.1%
13         1        < 0.1%
15         1        < 0.1%
18         1        < 0.1%
20         1        < 0.1%
23         1        < 0.1%
29         1        < 0.1%
30         1        < 0.1%
31         1        < 0.1%
32         1        < 0.1%

Maximum 10 values

Value      Count    Frequency (%)
1880464    1        < 0.1%
1880462    1        < 0.1%
1880452    1        < 0.1%
1880450    1        < 0.1%
1880449    1        < 0.1%
1880444    1        < 0.1%
1880439    1        < 0.1%
1880434    1        < 0.1%
1880432    1        < 0.1%
1880427    1        < 0.1%

OBJECTID
Real number (ℝ)

HIGH CORRELATION
UNIQUE

Distinct: 470116
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 941264.02
Minimum: 9
Maximum: 1880465
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.2 MiB

Quantile statistics

Minimum: 9
5-th percentile: 93987.75
Q1: 472126.25
median: 941790
Q3: 1411156.5
95-th percentile: 1786485.2
Maximum: 1880465
Range: 1880456
Interquartile range (IQR): 939030.25

Descriptive statistics

Standard deviation: 542587.69
Coefficient of variation (CV): 0.5764458
Kurtosis: -1.1976586
Mean: 941264.02
Median Absolute Deviation (MAD): 469524.5
Skewness: -0.0039471539
Sum: 4.4250327 × 10^11
Variance: 2.944014 × 10^11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                    Count     Frequency (%)
894422                   1         < 0.1%
653828                   1         < 0.1%
724487                   1         < 0.1%
576393                   1         < 0.1%
1845935                  1         < 0.1%
425990                   1         < 0.1%
726988                   1         < 0.1%
544051                   1         < 0.1%
1423095                  1         < 0.1%
1606070                  1         < 0.1%
Other values (470106)    470106    > 99.9%

Minimum 10 values

Value      Count    Frequency (%)
9          1        < 0.1%
14         1        < 0.1%
16         1        < 0.1%
19         1        < 0.1%
21         1        < 0.1%
24         1        < 0.1%
30         1        < 0.1%
31         1        < 0.1%
32         1        < 0.1%
33         1        < 0.1%

Maximum 10 values

Value      Count    Frequency (%)
1880465    1        < 0.1%
1880463    1        < 0.1%
1880453    1        < 0.1%
1880451    1        < 0.1%
1880450    1        < 0.1%
1880445    1        < 0.1%
1880440    1        < 0.1%
1880435    1        < 0.1%
1880433    1        < 0.1%
1880428    1        < 0.1%

FOD_ID
Real number (ℝ)

HIGH CORRELATION
UNIQUE

Distinct: 470116
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 54850608
Minimum: 9
Maximum: 3.003484 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.2 MiB

Quantile statistics

Minimum: 9
5-th percentile: 95025.75
Q1: 507542.25
median: 1069324
Q3: 19107196
95-th percentile: 3.0014627 × 10^8
Maximum: 3.003484 × 10^8
Range: 3.0034839 × 10^8
Interquartile range (IQR): 18599653

Descriptive statistics

Standard deviation: 1.0115502 × 10^8
Coefficient of variation (CV): 1.8441914
Kurtosis: 0.58219323
Mean: 54850608
Median Absolute Deviation (MAD): 700045.5
Skewness: 1.510105
Sum: 2.5786148 × 10^13
Variance: 1.0232338 × 10^16
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                    Count     Frequency (%)
1020177                  1         < 0.1%
706542                   1         < 0.1%
822690                   1         < 0.1%
622813                   1         < 0.1%
300271799                1         < 0.1%
460013                   1         < 0.1%
825789                   1         < 0.1%
585165                   1         < 0.1%
19501872                 1         < 0.1%
201627092                1         < 0.1%
Other values (470106)    470106    > 99.9%

Minimum 10 values

Value        Count    Frequency (%)
9            1        < 0.1%
14           1        < 0.1%
16           1        < 0.1%
19           1        < 0.1%
21           1        < 0.1%
24           1        < 0.1%
30           1        < 0.1%
31           1        < 0.1%
32           1        < 0.1%
33           1        < 0.1%

Maximum 10 values

Value        Count    Frequency (%)
300348399    1        < 0.1%
300348375    1        < 0.1%
300348293    1        < 0.1%
300348290    1        < 0.1%
300348289    1        < 0.1%
300348258    1        < 0.1%
300348247    1        < 0.1%
300348212    1        < 0.1%
300348204    1        < 0.1%
300348172    1        < 0.1%

FPA_ID
Categorical

HIGH CARDINALITY
UNIFORM
UNIQUE

Distinct: 470116
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 7.2 MiB
SWRA_LA_22416: 1
SFO-TX0483-72735: 1
NM98-10680544X: 1
SFO-SC02170707-7FF0526: 1
SFO-2015NY4541NY4541-2015-15150: 1
Other values (470111): 470111

Length

Max length: 49
Median length: 36
Mean length: 16.540067
Min length: 3

Characters and Unicode

Total characters: 7775750
Distinct characters: 69
Distinct categories: 10
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
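
As a rough illustration of how such character properties can be tallied, the sketch below counts Unicode general categories with Python's standard unicodedata module, using the five FPA_ID sample rows shown in this section. The helper name is illustrative, and the report's own implementation may differ. The two-letter codes correspond to the category names used in the tables (Lu = Uppercase Letter, Nd = Decimal Number, Pc = Connector Punctuation, Pd = Dash Punctuation).

```python
import unicodedata
from collections import Counter

# The five sample FPA_ID values listed in the report.
samples = [
    "SWRA_LA_22416",
    "TFS_NC_220729",
    "UT_2828-2000",
    "FS-1438071",
    "CDF_1997_55_2223_882",
]

def category_counts(values):
    """Count Unicode general categories over all characters in the values."""
    counts = Counter()
    for value in values:
        counts.update(unicodedata.category(ch) for ch in value)
    return counts

for code, count in category_counts(samples).most_common():
    print(code, count)
```

Applied to the whole column rather than five rows, the same tally yields a per-category breakdown of the kind shown under "Most occurring categories".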

Unique

Unique: 470116
Unique (%): 100.0%

Sample

1st row: SWRA_LA_22416
2nd row: TFS_NC_220729
3rd row: UT_2828-2000
4th row: FS-1438071
5th row: CDF_1997_55_2223_882

Common Values

Value                              Count     Frequency (%)
SWRA_LA_22416                      1         < 0.1%
SFO-TX0483-72735                   1         < 0.1%
NM98-10680544X                     1         < 0.1%
SFO-SC02170707-7FF0526             1         < 0.1%
SFO-2015NY4541NY4541-2015-15150    1         < 0.1%
SFO-GA00680505-42-221-0018-10      1         < 0.1%
SC_3715                            1         < 0.1%
SFO-NJ0285-02_A091204              1         < 0.1%
SFO-LA-2010-639                    1         < 0.1%
SFO-AR-2012-AR6/30/2012111625      1         < 0.1%
Other values (470106)              470106    > 99.9%

Length

Histogram of lengths of the category
Value                    Count     Frequency (%)
sfo-2010-vavas           246       0.1%
sfo-2013ladaf            242       0.1%
sfo-2014vavas            202       < 0.1%
sfo-2014ladaf            201       < 0.1%
2011vavas                191       < 0.1%
sfo-2013vavas            143       < 0.1%
sfo-2015vavas            143       < 0.1%
2011gagas-fy2011-jeff    22        < 0.1%
sfo-ga-fy2001-bryan      21        < 0.1%
sfo-ga-fy2002-bryan      19        < 0.1%
Other values (470043)    470397    99.7%

Most occurring characters

Value                Count      Frequency (%)
0                    943372     12.1%
-                    729312     9.4%
1                    615778     7.9%
2                    595893     7.7%
S                    428543     5.5%
F                    394154     5.1%
3                    362156     4.7%
5                    347980     4.5%
4                    346555     4.5%
9                    303688     3.9%
Other values (59)    2708319    34.8%

Most occurring categories

Value                    Count      Frequency (%)
Decimal Number           4253291    54.7%
Uppercase Letter         2297603    29.5%
Dash Punctuation         729312     9.4%
Connector Punctuation    244474     3.1%
Space Separator          143497     1.8%
Lowercase Letter         77720      1.0%
Other Punctuation        27350      0.4%
Open Punctuation         1251       < 0.1%
Close Punctuation        1251       < 0.1%
Modifier Symbol          1          < 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S 428543
18.7%
F 394154
17.2%
O 205816
9.0%
A 180783
7.9%
T 136777
 
6.0%
C 126110
 
5.5%
W 121688
 
5.3%
N 112509
 
4.9%
D 73702
 
3.2%
R 67372
 
2.9%
Other values (16) 450149
19.6%
Lowercase Letter
ValueCountFrequency (%)
e 8818
11.3%
n 7638
9.8%
o 7347
 
9.5%
a 7341
 
9.4%
r 6937
 
8.9%
l 5810
 
7.5%
i 4571
 
5.9%
t 4305
 
5.5%
s 3591
 
4.6%
h 2736
 
3.5%
Other values (14) 18626
24.0%
Decimal Number
ValueCountFrequency (%)
0 943372
22.2%
1 615778
14.5%
2 595893
14.0%
3 362156
 
8.5%
5 347980
 
8.2%
4 346555
 
8.1%
9 303688
 
7.1%
6 284560
 
6.7%
7 231561
 
5.4%
8 221748
 
5.2%
Other Punctuation
ValueCountFrequency (%)
/ 26843
98.1%
. 265
 
1.0%
, 242
 
0.9%
Dash Punctuation
ValueCountFrequency (%)
- 729312
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 244474
100.0%
Space Separator
ValueCountFrequency (%)
143497
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1251
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1251
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 5400427
69.5%
Latin 2375323
30.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 428543
18.0%
F 394154
16.6%
O 205816
 
8.7%
A 180783
 
7.6%
T 136777
 
5.8%
C 126110
 
5.3%
W 121688
 
5.1%
N 112509
 
4.7%
D 73702
 
3.1%
R 67372
 
2.8%
Other values (40) 527869
22.2%
Common
ValueCountFrequency (%)
0 943372
17.5%
- 729312
13.5%
1 615778
11.4%
2 595893
11.0%
3 362156
 
6.7%
5 347980
 
6.4%
4 346555
 
6.4%
9 303688
 
5.6%
6 284560
 
5.3%
_ 244474
 
4.5%
Other values (9) 626659
11.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7775750
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 943372
 
12.1%
- 729312
 
9.4%
1 615778
 
7.9%
2 595893
 
7.7%
S 428543
 
5.5%
F 394154
 
5.1%
3 362156
 
4.7%
5 347980
 
4.5%
4 346555
 
4.5%
9 303688
 
3.9%
Other values (59) 2708319
34.8%

SOURCE_SYSTEM_TYPE
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.2 MiB
NONFED: 340631
FED: 119998
INTERAGCY: 9487

Length

Max length9
Median length6
Mean length5.2947847
Min length3

Characters and Unicode

Total characters2489163
Distinct characters12
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowNONFED
2nd rowNONFED
3rd rowNONFED
4th rowFED
5th rowNONFED

Common Values

ValueCountFrequency (%)
NONFED 340631
72.5%
FED 119998
 
25.5%
INTERAGCY 9487
 
2.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
nonfed 340631
72.5%
fed 119998
 
25.5%
interagcy 9487
 
2.0%

Most occurring characters

ValueCountFrequency (%)
N 690749
27.8%
E 470116
18.9%
F 460629
18.5%
D 460629
18.5%
O 340631
13.7%
I 9487
 
0.4%
T 9487
 
0.4%
R 9487
 
0.4%
A 9487
 
0.4%
G 9487
 
0.4%
Other values (2) 18974
 
0.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 2489163
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 690749
27.8%
E 470116
18.9%
F 460629
18.5%
D 460629
18.5%
O 340631
13.7%
I 9487
 
0.4%
T 9487
 
0.4%
R 9487
 
0.4%
A 9487
 
0.4%
G 9487
 
0.4%
Other values (2) 18974
 
0.8%

Most occurring scripts

ValueCountFrequency (%)
Latin 2489163
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 690749
27.8%
E 470116
18.9%
F 460629
18.5%
D 460629
18.5%
O 340631
13.7%
I 9487
 
0.4%
T 9487
 
0.4%
R 9487
 
0.4%
A 9487
 
0.4%
G 9487
 
0.4%
Other values (2) 18974
 
0.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2489163
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 690749
27.8%
E 470116
18.9%
F 460629
18.5%
D 460629
18.5%
O 340631
13.7%
I 9487
 
0.4%
T 9487
 
0.4%
R 9487
 
0.4%
A 9487
 
0.4%
G 9487
 
0.4%
Other values (2) 18974
 
0.8%

SOURCE_SYSTEM
Categorical

Distinct: 38
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.2 MiB
ST-NASF: 177686
DOI-WFMI: 59857
FS-FIRESTAT: 55257
ST-CACDF: 21802
ST-NCNCS: 16416
Other values (33): 139098

Length

Max length11
Median length9
Mean length7.9875095
Min length7

Characters and Unicode

Total characters3755056
Distinct characters27
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowST-LALAS
2nd rowST-NCNCS
3rd rowST-UTUTS
4th rowFS-FIRESTAT
5th rowST-CACDF

Common Values

ValueCountFrequency (%)
ST-NASF 177686
37.8%
DOI-WFMI 59857
 
12.7%
FS-FIRESTAT 55257
 
11.8%
ST-CACDF 21802
 
4.6%
ST-NCNCS 16416
 
3.5%
ST-GAGAS 16203
 
3.4%
ST-MSMSS 15046
 
3.2%
ST-TXTXS 14504
 
3.1%
ST-ALALS 13901
 
3.0%
ST-SCSCS 12209
 
2.6%
Other values (28) 67235
 
14.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
st-nasf 177686
37.8%
doi-wfmi 59857
 
12.7%
fs-firestat 55257
 
11.8%
st-cacdf 21802
 
4.6%
st-ncncs 16416
 
3.5%
st-gagas 16203
 
3.4%
st-msmss 15046
 
3.2%
st-txtxs 14504
 
3.1%
st-alals 13901
 
3.0%
st-scscs 12209
 
2.6%
Other values (28) 67235
 
14.3%

Most occurring characters

ValueCountFrequency (%)
S 835630
22.3%
T 497430
13.2%
- 470116
12.5%
F 409347
10.9%
A 352873
9.4%
N 220927
 
5.9%
I 212641
 
5.7%
M 110536
 
2.9%
C 106117
 
2.8%
O 85678
 
2.3%
Other values (17) 453761
12.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 3283377
87.4%
Dash Punctuation 470116
 
12.5%
Decimal Number 1563
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S 835630
25.5%
T 497430
15.1%
F 409347
12.5%
A 352873
10.7%
N 220927
 
6.7%
I 212641
 
6.5%
M 110536
 
3.4%
C 106117
 
3.2%
O 85678
 
2.6%
D 83379
 
2.5%
Other values (13) 368819
11.2%
Decimal Number
ValueCountFrequency (%)
2 521
33.3%
0 521
33.3%
9 521
33.3%
Dash Punctuation
ValueCountFrequency (%)
- 470116
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3283377
87.4%
Common 471679
 
12.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 835630
25.5%
T 497430
15.1%
F 409347
12.5%
A 352873
10.7%
N 220927
 
6.7%
I 212641
 
6.5%
M 110536
 
3.4%
C 106117
 
3.2%
O 85678
 
2.6%
D 83379
 
2.5%
Other values (13) 368819
11.2%
Common
ValueCountFrequency (%)
- 470116
99.7%
2 521
 
0.1%
0 521
 
0.1%
9 521
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3755056
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S 835630
22.3%
T 497430
13.2%
- 470116
12.5%
F 409347
10.9%
A 352873
9.4%
N 220927
 
5.9%
I 212641
 
5.7%
M 110536
 
2.9%
C 106117
 
2.8%
O 85678
 
2.3%
Other values (17) 453761
12.1%

NWCG_REPORTING_AGENCY
Categorical

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.2 MiB
ST/C&L: 344485
FS: 55290
BIA: 29700
BLM: 24177
IA: 5536
Other values (5): 10928

Length

Max length6
Median length6
Mean length5.0726523
Min length2

Characters and Unicode

Total characters2384735
Distinct characters18
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowST/C&L
2nd rowST/C&L
3rd rowST/C&L
4th rowFS
5th rowST/C&L

Common Values

ValueCountFrequency (%)
ST/C&L 344485
73.3%
FS 55290
 
11.8%
BIA 29700
 
6.3%
BLM 24177
 
5.1%
IA 5536
 
1.2%
NPS 5138
 
1.1%
FWS 4887
 
1.0%
TRIBE 879
 
0.2%
DOD 20
 
< 0.1%
BOR 4
 
< 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
st/c&l 344485
73.3%
fs 55290
 
11.8%
bia 29700
 
6.3%
blm 24177
 
5.1%
ia 5536
 
1.2%
nps 5138
 
1.1%
fws 4887
 
1.0%
tribe 879
 
0.2%
dod 20
 
< 0.1%
bor 4
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
S 409800
17.2%
L 368662
15.5%
T 345364
14.5%
/ 344485
14.4%
C 344485
14.4%
& 344485
14.4%
F 60177
 
2.5%
B 54760
 
2.3%
I 36115
 
1.5%
A 35236
 
1.5%
Other values (8) 41166
 
1.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1695765
71.1%
Other Punctuation 688970
28.9%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S 409800
24.2%
L 368662
21.7%
T 345364
20.4%
C 344485
20.3%
F 60177
 
3.5%
B 54760
 
3.2%
I 36115
 
2.1%
A 35236
 
2.1%
M 24177
 
1.4%
N 5138
 
0.3%
Other values (6) 11851
 
0.7%
Other Punctuation
ValueCountFrequency (%)
/ 344485
50.0%
& 344485
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1695765
71.1%
Common 688970
28.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 409800
24.2%
L 368662
21.7%
T 345364
20.4%
C 344485
20.3%
F 60177
 
3.5%
B 54760
 
3.2%
I 36115
 
2.1%
A 35236
 
2.1%
M 24177
 
1.4%
N 5138
 
0.3%
Other values (6) 11851
 
0.7%
Common
ValueCountFrequency (%)
/ 344485
50.0%
& 344485
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2384735
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S 409800
17.2%
L 368662
15.5%
T 345364
14.5%
/ 344485
14.4%
C 344485
14.4%
& 344485
14.4%
F 60177
 
2.5%
B 54760
 
2.3%
I 36115
 
1.5%
A 35236
 
1.5%
Other values (8) 41166
 
1.7%

NWCG_REPORTING_UNIT_ID
Categorical

Distinct: 1347
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 7.2 MiB
USGAGAS: 41647
USTXTXS: 27863
USNCNCS: 26817
USFLFLS: 20786
USSCSCS: 19528
Other values (1342): 333475

Length

Max length9
Median length7
Mean length7.0310306
Min length7

Characters and Unicode

Total characters3305400
Distinct characters36
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique230 ?
Unique (%)< 0.1%

Sample

1st rowUSLALAS
2nd rowUSNCNCS
3rd rowUSUTUTS
4th rowUSMTKNF
5th rowUSCAMVU

Common Values

ValueCountFrequency (%)
USGAGAS 41647
 
8.9%
USTXTXS 27863
 
5.9%
USNCNCS 26817
 
5.7%
USFLFLS 20786
 
4.4%
USSCSCS 19528
 
4.2%
USNYNYX 18812
 
4.0%
USMSMSS 18655
 
4.0%
USALALS 16369
 
3.5%
USOKOKS 7398
 
1.6%
USMNMNS 7327
 
1.6%
Other values (1337) 264914
56.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
usgagas 41647
 
8.9%
ustxtxs 27863
 
5.9%
usncncs 26817
 
5.7%
usflfls 20786
 
4.4%
usscscs 19528
 
4.2%
usnynyx 18812
 
4.0%
usmsmss 18655
 
4.0%
usalals 16369
 
3.5%
usokoks 7398
 
1.6%
usmnmns 7327
 
1.6%
Other values (1337) 264914
56.4%

Most occurring characters

ValueCountFrequency (%)
S 877876
26.6%
U 514903
15.6%
A 299340
 
9.1%
N 202269
 
6.1%
C 187328
 
5.7%
T 126772
 
3.8%
M 126744
 
3.8%
F 116840
 
3.5%
L 108592
 
3.3%
G 90844
 
2.7%
Other values (26) 653892
19.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 3291626
99.6%
Decimal Number 13774
 
0.4%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S 877876
26.7%
U 514903
15.6%
A 299340
 
9.1%
N 202269
 
6.1%
C 187328
 
5.7%
T 126772
 
3.9%
M 126744
 
3.9%
F 116840
 
3.5%
L 108592
 
3.3%
G 90844
 
2.8%
Other values (16) 640118
19.4%
Decimal Number
ValueCountFrequency (%)
1 2807
20.4%
7 2690
19.5%
5 2047
14.9%
9 1802
13.1%
2 1772
12.9%
8 956
 
6.9%
3 802
 
5.8%
0 551
 
4.0%
4 328
 
2.4%
6 19
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 3291626
99.6%
Common 13774
 
0.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 877876
26.7%
U 514903
15.6%
A 299340
 
9.1%
N 202269
 
6.1%
C 187328
 
5.7%
T 126772
 
3.9%
M 126744
 
3.9%
F 116840
 
3.5%
L 108592
 
3.3%
G 90844
 
2.8%
Other values (16) 640118
19.4%
Common
ValueCountFrequency (%)
1 2807
20.4%
7 2690
19.5%
5 2047
14.9%
9 1802
13.1%
2 1772
12.9%
8 956
 
6.9%
3 802
 
5.8%
0 551
 
4.0%
4 328
 
2.4%
6 19
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3305400
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S 877876
26.6%
U 514903
15.6%
A 299340
 
9.1%
N 202269
 
6.1%
C 187328
 
5.7%
T 126772
 
3.8%
M 126744
 
3.8%
F 116840
 
3.5%
L 108592
 
3.3%
G 90844
 
2.7%
Other values (26) 653892
19.8%

NWCG_REPORTING_UNIT_NAME
Categorical

Distinct: 1343
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 7.2 MiB
Georgia Forestry Commission: 41647
Texas A & M Forest Service: 27863
North Carolina Forest Service: 26817
Florida Forest Service: 20786
South Carolina Forestry Commission: 19528
Other values (1338): 333475

Length

Max length79
Median length59
Mean length27.402694
Min length5

Characters and Unicode

Total characters12882445
Distinct characters63
Distinct categories8 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique227 ?
Unique (%)< 0.1%

Sample

1st rowLouisiana Office of Forestry
2nd rowNorth Carolina Forest Service
3rd rowUtah Division Forestry Fire State Lands
4th rowKootenai National Forest
5th rowMonte Vista Unit

Common Values

ValueCountFrequency (%)
Georgia Forestry Commission 41647
 
8.9%
Texas A & M Forest Service 27863
 
5.9%
North Carolina Forest Service 26817
 
5.7%
Florida Forest Service 20786
 
4.4%
South Carolina Forestry Commission 19528
 
4.2%
Fire Department of New York 18812
 
4.0%
Mississippi Forestry Commission 18655
 
4.0%
Alabama Forestry Commission 16369
 
3.5%
Oklahoma Division of Forestry 7398
 
1.6%
Minnesota Department of Natural Resources 7327
 
1.6%
Other values (1333) 264914
56.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
forestry 163725
 
9.3%
forest 146665
 
8.3%
commission 103217
 
5.8%
of 93717
 
5.3%
service 91983
 
5.2%
national 63996
 
3.6%
department 51553
 
2.9%
carolina 46945
 
2.7%
43379
 
2.5%
georgia 41647
 
2.4%
Other values (1352) 922735
52.1%

Most occurring characters

ValueCountFrequency (%)
1300579
 
10.1%
e 1186807
 
9.2%
o 1132310
 
8.8%
r 1066270
 
8.3%
i 1052541
 
8.2%
s 926899
 
7.2%
t 835682
 
6.5%
a 831152
 
6.5%
n 646885
 
5.0%
F 393074
 
3.1%
Other values (53) 3510246
27.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 9846049
76.4%
Uppercase Letter 1659226
 
12.9%
Space Separator 1300579
 
10.1%
Dash Punctuation 41761
 
0.3%
Other Punctuation 34787
 
0.3%
Open Punctuation 20
 
< 0.1%
Close Punctuation 20
 
< 0.1%
Decimal Number 3
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 1186807
12.1%
o 1132310
11.5%
r 1066270
10.8%
i 1052541
10.7%
s 926899
9.4%
t 835682
8.5%
a 831152
8.4%
n 646885
6.6%
m 307188
 
3.1%
l 293414
 
3.0%
Other values (16) 1566901
15.9%
Uppercase Letter
ValueCountFrequency (%)
F 393074
23.7%
C 194092
11.7%
S 168716
10.2%
N 155694
 
9.4%
D 117805
 
7.1%
A 101119
 
6.1%
M 94231
 
5.7%
T 64929
 
3.9%
G 49810
 
3.0%
R 43486
 
2.6%
Other values (15) 276270
16.7%
Other Punctuation
ValueCountFrequency (%)
& 29496
84.8%
/ 2823
 
8.1%
. 1252
 
3.6%
' 1211
 
3.5%
# 3
 
< 0.1%
" 2
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
4 2
66.7%
1 1
33.3%
Space Separator
ValueCountFrequency (%)
1300579
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 41761
100.0%
Open Punctuation
ValueCountFrequency (%)
( 20
100.0%
Close Punctuation
ValueCountFrequency (%)
) 20
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 11505275
89.3%
Common 1377170
 
10.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 1186807
 
10.3%
o 1132310
 
9.8%
r 1066270
 
9.3%
i 1052541
 
9.1%
s 926899
 
8.1%
t 835682
 
7.3%
a 831152
 
7.2%
n 646885
 
5.6%
F 393074
 
3.4%
m 307188
 
2.7%
Other values (41) 3126467
27.2%
Common
ValueCountFrequency (%)
1300579
94.4%
- 41761
 
3.0%
& 29496
 
2.1%
/ 2823
 
0.2%
. 1252
 
0.1%
' 1211
 
0.1%
( 20
 
< 0.1%
) 20
 
< 0.1%
# 3
 
< 0.1%
4 2
 
< 0.1%
Other values (2) 3
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12882445
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1300579
 
10.1%
e 1186807
 
9.2%
o 1132310
 
8.8%
r 1066270
 
8.3%
i 1052541
 
8.2%
s 926899
 
7.2%
t 835682
 
6.5%
a 831152
 
6.5%
n 646885
 
5.0%
F 393074
 
3.1%
Other values (53) 3510246
27.2%

SOURCE_REPORTING_UNIT
Categorical

Distinct: 4077
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 7.2 MiB
GAGAS: 24380
SCSCS: 12754
TXTXS: 10120
FLFLS: 9549
NCNCS: 9320
Other values (4072): 403993

Length

Max length21
Median length5
Mean length5.5718184
Min length2

Characters and Unicode

Total characters2619401
Distinct characters62
Distinct categories7 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique764 ?
Unique (%)0.2%

Sample

1st rowLALAS7
2nd rowNCNCS203
3rd rowUTUTS
4th row0114
5th rowCAMVU

Common Values

ValueCountFrequency (%)
GAGAS 24380
 
5.2%
SCSCS 12754
 
2.7%
TXTXS 10120
 
2.2%
FLFLS 9549
 
2.0%
NCNCS 9320
 
2.0%
TXVFD 9015
 
1.9%
MSMSS 7977
 
1.7%
MNMNS 5957
 
1.3%
PRIITF 5522
 
1.2%
WVDOF 4253
 
0.9%
Other values (4067) 371269
79.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
gagas 24380
 
4.9%
ga 12882
 
2.6%
scscs 12754
 
2.5%
txtxs 10120
 
2.0%
flfls 9549
 
1.9%
ncncs 9320
 
1.9%
txvfd 9015
 
1.8%
msmss 7977
 
1.6%
ms 7085
 
1.4%
mnmns 5957
 
1.2%
Other values (4083) 391216
78.2%

Most occurring characters

ValueCountFrequency (%)
S 308073
 
11.8%
A 255753
 
9.8%
C 173944
 
6.6%
N 158456
 
6.0%
0 119239
 
4.6%
L 100917
 
3.9%
F 98624
 
3.8%
M 98035
 
3.7%
T 95476
 
3.6%
D 87556
 
3.3%
Other values (52) 1123328
42.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1961441
74.9%
Decimal Number 431940
 
16.5%
Lowercase Letter 189088
 
7.2%
Space Separator 30876
 
1.2%
Dash Punctuation 5515
 
0.2%
Connector Punctuation 538
 
< 0.1%
Other Punctuation 3
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S 308073
15.7%
A 255753
13.0%
C 173944
 
8.9%
N 158456
 
8.1%
L 100917
 
5.1%
F 98624
 
5.0%
M 98035
 
5.0%
T 95476
 
4.9%
D 87556
 
4.5%
G 77878
 
4.0%
Other values (16) 506729
25.8%
Lowercase Letter
ValueCountFrequency (%)
t 21680
11.5%
e 21673
11.5%
a 21126
11.2%
s 16422
8.7%
o 15933
8.4%
l 13146
 
7.0%
i 12154
 
6.4%
r 10847
 
5.7%
h 9650
 
5.1%
u 8137
 
4.3%
Other values (12) 38320
20.3%
Decimal Number
ValueCountFrequency (%)
0 119239
27.6%
1 77997
18.1%
2 49243
11.4%
3 40780
 
9.4%
4 34004
 
7.9%
5 33005
 
7.6%
6 26884
 
6.2%
8 20002
 
4.6%
7 16118
 
3.7%
9 14668
 
3.4%
Space Separator
ValueCountFrequency (%)
30876
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 5515
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 538
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2150529
82.1%
Common 468872
 
17.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 308073
14.3%
A 255753
 
11.9%
C 173944
 
8.1%
N 158456
 
7.4%
L 100917
 
4.7%
F 98624
 
4.6%
M 98035
 
4.6%
T 95476
 
4.4%
D 87556
 
4.1%
G 77878
 
3.6%
Other values (38) 695817
32.4%
Common
ValueCountFrequency (%)
0 119239
25.4%
1 77997
16.6%
2 49243
10.5%
3 40780
 
8.7%
4 34004
 
7.3%
5 33005
 
7.0%
30876
 
6.6%
6 26884
 
5.7%
8 20002
 
4.3%
7 16118
 
3.4%
Other values (4) 20724
 
4.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2619401
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S 308073
 
11.8%
A 255753
 
9.8%
C 173944
 
6.6%
N 158456
 
6.0%
0 119239
 
4.6%
L 100917
 
3.9%
F 98624
 
3.8%
M 98035
 
3.7%
T 95476
 
3.6%
D 87556
 
3.3%
Other values (52) 1123328
42.9%

SOURCE_REPORTING_UNIT_NAME
Categorical

Distinct: 3566
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 7.2 MiB
Georgia Forestry Commission: 24380
Fire Department of New York: 18812
South Carolina Forestry Commission: 12754
Mississippi Forestry Commission: 11570
Texas Forest Service: 10713
Other values (3561): 391887

Length

Max length72
Median length55
Mean length25.998211
Min length5

Characters and Unicode

Total characters12222175
Distinct characters75
Distinct categories9
Distinct scripts2
Distinct blocks2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
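As a rough illustration of how counts like the category tables in this report can be derived, the sketch below tallies characters by Unicode general category using Python's standard `unicodedata` module. The helper name `category_counts` and the `CATEGORY_NAMES` mapping are assumptions made for this example, not the profiler's own code; the two input strings are taken from the sample rows of this variable.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that appear in this report.
# The mapping is written out by hand for the example (an assumption, not
# something exported by unicodedata).
CATEGORY_NAMES = {
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Pd": "Dash Punctuation",
    "Pc": "Connector Punctuation",
    "Po": "Other Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Sk": "Modifier Symbol",
    "Sm": "Math Symbol",
    "Sc": "Currency Symbol",
}


def category_counts(values):
    """Count characters per Unicode general category, skipping missing values."""
    counts = Counter()
    for value in values:
        if value is None:
            continue
        for ch in value:
            code = unicodedata.category(ch)
            # Fall back to the two-letter code for categories not in the mapping.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


sample = ["LAS District 7", "CDF - Monte Vista Unit", None]
counts = category_counts(sample)
for name, n in counts.most_common():
    print(name, n)
```

Dividing each count by the total number of characters gives percentages comparable to the "Frequency (%)" column.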

Unique

Unique672
Unique (%)0.1%

Sample

1st rowLAS District 7
2nd rowNCS Region 2 District 3
3rd rowUtah Division Forestry Fire State Lands
4th rowKootenai National Forest
5th rowCDF - Monte Vista Unit

Common Values

ValueCountFrequency (%)
Georgia Forestry Commission 24380
 
5.2%
Fire Department of New York 18812
 
4.0%
South Carolina Forestry Commission 12754
 
2.7%
Mississippi Forestry Commission 11570
 
2.5%
Texas Forest Service 10713
 
2.3%
North Carolina Division of Forest Resources 9939
 
2.1%
Florida Forest Service 9549
 
2.0%
Minnesota Department of Natural Resources 7327
 
1.6%
International Institute of Tropical Forestry 5522
 
1.2%
Alabama Forestry Commission 5347
 
1.1%
Other values (3556) 354203
75.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
forestry 103351
 
5.9%
forest 103084
 
5.9%
district 82651
 
4.7%
of 82182
 
4.7%
national 62675
 
3.6%
commission 56759
 
3.2%
department 54866
 
3.1%
unit 47177
 
2.7%
fire 43726
 
2.5%
service 39930
 
2.3%
Other values (2839) 1078808
61.5%

Most occurring characters

ValueCountFrequency (%)
(space) 1285249
 
10.5%
e 1084482
 
8.9%
i 1002637
 
8.2%
o 908615
 
7.4%
r 896850
 
7.3%
t 895294
 
7.3%
s 757233
 
6.2%
a 721696
 
5.9%
n 628435
 
5.1%
F 331233
 
2.7%
Other values (65) 3710451
30.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 8915854
72.9%
Uppercase Letter 1857306
 
15.2%
Space Separator 1285249
 
10.5%
Decimal Number 79770
 
0.7%
Dash Punctuation 55123
 
0.5%
Other Punctuation 21725
 
0.2%
Open Punctuation 3567
 
< 0.1%
Close Punctuation 3567
 
< 0.1%
Modifier Symbol 14
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 1084482
12.2%
i 1002637
11.2%
o 908615
10.2%
r 896850
10.1%
t 895294
10.0%
s 757233
8.5%
a 721696
8.1%
n 628435
7.0%
l 296202
 
3.3%
c 295473
 
3.3%
Other values (17) 1428937
16.0%
Uppercase Letter
ValueCountFrequency (%)
F 331233
17.8%
S 216538
11.7%
D 214050
11.5%
C 193636
10.4%
N 158269
8.5%
A 108938
 
5.9%
R 81469
 
4.4%
M 71673
 
3.9%
U 53909
 
2.9%
G 53586
 
2.9%
Other values (16) 374005
20.1%
Decimal Number
ValueCountFrequency (%)
2 20302
25.5%
1 16410
20.6%
3 14647
18.4%
4 6844
 
8.6%
6 5056
 
6.3%
5 4620
 
5.8%
0 3614
 
4.5%
8 3430
 
4.3%
9 2461
 
3.1%
7 2386
 
3.0%
Other Punctuation
ValueCountFrequency (%)
, 14616
67.3%
. 4482
 
20.6%
& 1528
 
7.0%
/ 441
 
2.0%
' 418
 
1.9%
# 238
 
1.1%
" 2
 
< 0.1%
Space Separator
ValueCountFrequency (%)
(space) 1285249
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 55123
100.0%
Open Punctuation
ValueCountFrequency (%)
( 3567
100.0%
Close Punctuation
ValueCountFrequency (%)
) 3567
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 14
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 10773160
88.1%
Common 1449015
 
11.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 1084482
 
10.1%
i 1002637
 
9.3%
o 908615
 
8.4%
r 896850
 
8.3%
t 895294
 
8.3%
s 757233
 
7.0%
a 721696
 
6.7%
n 628435
 
5.8%
F 331233
 
3.1%
l 296202
 
2.7%
Other values (43) 3250483
30.2%
Common
ValueCountFrequency (%)
(space) 1285249
88.7%
- 55123
 
3.8%
2 20302
 
1.4%
1 16410
 
1.1%
3 14647
 
1.0%
, 14616
 
1.0%
4 6844
 
0.5%
6 5056
 
0.3%
5 4620
 
0.3%
. 4482
 
0.3%
Other values (12) 21666
 
1.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12222164
> 99.9%
None 11
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
(space) 1285249
 
10.5%
e 1084482
 
8.9%
i 1002637
 
8.2%
o 908615
 
7.4%
r 896850
 
7.3%
t 895294
 
7.3%
s 757233
 
6.2%
a 721696
 
5.9%
n 628435
 
5.1%
F 331233
 
2.7%
Other values (64) 3710440
30.4%
None
ValueCountFrequency (%)
ñ 11
100.0%

LOCAL_FIRE_REPORT_ID
Categorical

HIGH CARDINALITY
MISSING

Distinct5048
Distinct (%)4.8%
Missing364964
Missing (%)77.6%
Memory size7.2 MiB
001
 
2087
002
 
1228
2
 
913
5
 
884
1
 
880
Other values (5043)
99160 
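
The summary figures above appear to use two different denominators: Missing (%) is taken over all 470116 observations, while Distinct (%) is taken over the non-missing values only. The sketch below checks that reading against the reported numbers. The function name `summary_percentages` is hypothetical, and the interpretation is inferred from the arithmetic, not from the profiler's documentation.

```python
def summary_percentages(n_rows, n_missing, n_distinct):
    """Return missing and distinct percentages under the assumed denominators."""
    n_present = n_rows - n_missing  # rows that actually carry a value
    return {
        "missing_pct": 100.0 * n_missing / n_rows,       # share of all rows
        "distinct_pct": 100.0 * n_distinct / n_present,  # share of non-missing rows
    }


# Figures reported for LOCAL_FIRE_REPORT_ID in this section.
stats = summary_percentages(n_rows=470116, n_missing=364964, n_distinct=5048)
print(f"Missing (%): {stats['missing_pct']:.1f}")
print(f"Distinct (%): {stats['distinct_pct']:.1f}")
```

Rounded to one decimal place, both values agree with the 77.6% and 4.8% shown for this variable.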

Length

Max length6
Median length5
Mean length2.4980029
Min length1

Characters and Unicode

Total characters262670
Distinct characters29
Distinct categories3
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique3330
Unique (%)3.2%

Sample

1st row203
2nd row46
3rd row557106
4th row654
5th row75

Common Values

ValueCountFrequency (%)
001 2087
 
0.4%
002 1228
 
0.3%
2 913
 
0.2%
5 884
 
0.2%
1 880
 
0.2%
6 850
 
0.2%
3 845
 
0.2%
003 842
 
0.2%
4 838
 
0.2%
8 837
 
0.2%
Other values (5038) 94948
 
20.2%
(Missing) 364964
77.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
001 2087
 
2.0%
002 1228
 
1.2%
2 913
 
0.9%
5 884
 
0.8%
1 880
 
0.8%
6 850
 
0.8%
3 845
 
0.8%
003 842
 
0.8%
4 838
 
0.8%
8 837
 
0.8%
Other values (5037) 94948
90.3%

Most occurring characters

ValueCountFrequency (%)
1 47643
18.1%
0 39671
15.1%
2 31722
12.1%
3 26016
9.9%
4 23772
9.1%
5 21535
8.2%
6 19611
7.5%
7 18162
 
6.9%
8 17180
 
6.5%
9 16928
 
6.4%
Other values (19) 430
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 262240
99.8%
Uppercase Letter 429
 
0.2%
Dash Punctuation 1
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
B 59
13.8%
C 54
12.6%
A 50
11.7%
D 32
 
7.5%
P 24
 
5.6%
F 23
 
5.4%
J 22
 
5.1%
T 21
 
4.9%
M 20
 
4.7%
E 20
 
4.7%
Other values (8) 104
24.2%
Decimal Number
ValueCountFrequency (%)
1 47643
18.2%
0 39671
15.1%
2 31722
12.1%
3 26016
9.9%
4 23772
9.1%
5 21535
8.2%
6 19611
7.5%
7 18162
 
6.9%
8 17180
 
6.6%
9 16928
 
6.5%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 262241
99.8%
Latin 429
 
0.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
B 59
13.8%
C 54
12.6%
A 50
11.7%
D 32
 
7.5%
P 24
 
5.6%
F 23
 
5.4%
J 22
 
5.1%
T 21
 
4.9%
M 20
 
4.7%
E 20
 
4.7%
Other values (8) 104
24.2%
Common
ValueCountFrequency (%)
1 47643
18.2%
0 39671
15.1%
2 31722
12.1%
3 26016
9.9%
4 23772
9.1%
5 21535
8.2%
6 19611
7.5%
7 18162
 
6.9%
8 17180
 
6.6%
9 16928
 
6.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 262670
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 47643
18.1%
0 39671
15.1%
2 31722
12.1%
3 26016
9.9%
4 23772
9.1%
5 21535
8.2%
6 19611
7.5%
7 18162
 
6.9%
8 17180
 
6.5%
9 16928
 
6.4%
Other values (19) 430
 
0.2%

LOCAL_INCIDENT_ID
Categorical

HIGH CARDINALITY
MISSING

Distinct165302
Distinct (%)62.4%
Missing205164
Missing (%)43.6%
Memory size7.2 MiB
001
 
969
1
 
780
2
 
714
3
 
628
4
 
614
Other values (165297)
261247 

Length

Max length28
Median length25
Mean length8.1476871
Min length1

Characters and Unicode

Total characters2158746
Distinct characters71
Distinct categories10
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique148201
Unique (%)55.9%

Sample

1st rowLA7-U2
2nd row01-083
3rd row2828-2000
4th row47
5th row882

Common Values

ValueCountFrequency (%)
001 969
 
0.2%
1 780
 
0.2%
2 714
 
0.2%
3 628
 
0.1%
4 614
 
0.1%
002 607
 
0.1%
5 590
 
0.1%
10 564
 
0.1%
6 559
 
0.1%
7 510
 
0.1%
Other values (165292) 258417
55.0%
(Missing) 205164
43.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
001 970
 
0.4%
1 784
 
0.3%
2 716
 
0.3%
3 632
 
0.2%
4 620
 
0.2%
002 607
 
0.2%
5 596
 
0.2%
10 575
 
0.2%
6 564
 
0.2%
7 516
 
0.2%
Other values (161549) 259226
97.5%

Most occurring characters

ValueCountFrequency (%)
0 439896
20.4%
1 231454
10.7%
2 202329
9.4%
(space) 141708
 
6.6%
- 137901
 
6.4%
3 126739
 
5.9%
4 118927
 
5.5%
5 111827
 
5.2%
9 99620
 
4.6%
6 90708
 
4.2%
Other values (61) 457637
21.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1598388
74.0%
Uppercase Letter 194353
 
9.0%
Space Separator 141708
 
6.6%
Dash Punctuation 137901
 
6.4%
Lowercase Letter 69895
 
3.2%
Other Punctuation 14077
 
0.7%
Open Punctuation 1075
 
< 0.1%
Close Punctuation 1075
 
< 0.1%
Connector Punctuation 273
 
< 0.1%
Math Symbol 1
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
Y 31828
16.4%
N 26682
13.7%
F 20008
10.3%
A 14562
 
7.5%
S 13421
 
6.9%
C 11672
 
6.0%
T 8528
 
4.4%
M 8517
 
4.4%
L 7426
 
3.8%
U 7326
 
3.8%
Other values (16) 44383
22.8%
Lowercase Letter
ValueCountFrequency (%)
e 7907
11.3%
n 6874
9.8%
a 6634
 
9.5%
o 6607
 
9.5%
r 6224
 
8.9%
l 5164
 
7.4%
i 4094
 
5.9%
t 3929
 
5.6%
s 3222
 
4.6%
h 2494
 
3.6%
Other values (14) 16746
24.0%
Decimal Number
ValueCountFrequency (%)
0 439896
27.5%
1 231454
14.5%
2 202329
12.7%
3 126739
 
7.9%
4 118927
 
7.4%
5 111827
 
7.0%
9 99620
 
6.2%
6 90708
 
5.7%
7 88827
 
5.6%
8 88061
 
5.5%
Other Punctuation
ValueCountFrequency (%)
. 10389
73.8%
/ 3421
 
24.3%
, 242
 
1.7%
# 21
 
0.1%
? 4
 
< 0.1%
Space Separator
ValueCountFrequency (%)
(space) 141708
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 137901
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1075
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1075
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 273
100.0%
Math Symbol
ValueCountFrequency (%)
= 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1894498
87.8%
Latin 264248
 
12.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
Y 31828
 
12.0%
N 26682
 
10.1%
F 20008
 
7.6%
A 14562
 
5.5%
S 13421
 
5.1%
C 11672
 
4.4%
T 8528
 
3.2%
M 8517
 
3.2%
e 7907
 
3.0%
L 7426
 
2.8%
Other values (40) 113697
43.0%
Common
ValueCountFrequency (%)
0 439896
23.2%
1 231454
12.2%
2 202329
10.7%
(space) 141708
 
7.5%
- 137901
 
7.3%
3 126739
 
6.7%
4 118927
 
6.3%
5 111827
 
5.9%
9 99620
 
5.3%
6 90708
 
4.8%
Other values (11) 193389
10.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2158746
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 439896
20.4%
1 231454
10.7%
2 202329
9.4%
(space) 141708
 
6.6%
- 137901
 
6.4%
3 126739
 
5.9%
4 118927
 
5.5%
5 111827
 
5.2%
9 99620
 
4.6%
6 90708
 
4.2%
Other values (61) 457637
21.2%

FIRE_CODE
Categorical

HIGH CARDINALITY
MISSING

Distinct50725
Distinct (%)62.4%
Missing388867
Missing (%)82.7%
Memory size7.2 MiB
D44Z
 
2399
5555
 
1338
0001
 
847
D5GJ
 
841
2300
 
470
Other values (50720)
75354 

Length

Max length6
Median length4
Mean length3.99856
Min length1

Characters and Unicode

Total characters324879
Distinct characters44
Distinct categories9
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique46169
Unique (%)56.8%

Sample

1st rowB6TB
2nd rowE156
3rd rowD887
4th row0339
5th row5555

Common Values

ValueCountFrequency (%)
D44Z 2399
 
0.5%
5555 1338
 
0.3%
0001 847
 
0.2%
D5GJ 841
 
0.2%
2300 470
 
0.1%
0000 461
 
0.1%
4700 277
 
0.1%
EKT5 242
 
0.1%
0100 241
 
0.1%
EKV3 239
 
0.1%
Other values (50715) 73894
 
15.7%
(Missing) 388867
82.7%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
d44z 2399
 
3.0%
5555 1338
 
1.6%
0001 847
 
1.0%
d5gj 841
 
1.0%
2300 470
 
0.6%
0000 461
 
0.6%
4700 277
 
0.3%
ekt5 242
 
0.3%
0100 241
 
0.3%
ekv3 239
 
0.3%
Other values (50715) 73894
90.9%

Most occurring characters

ValueCountFrequency (%)
0 26451
 
8.1%
5 19722
 
6.1%
4 18319
 
5.6%
E 17641
 
5.4%
1 16211
 
5.0%
2 15987
 
4.9%
6 15976
 
4.9%
3 13938
 
4.3%
D 12972
 
4.0%
7 12684
 
3.9%
Other values (34) 154978
47.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 163065
50.2%
Uppercase Letter 161801
49.8%
Lowercase Letter 4
 
< 0.1%
Dash Punctuation 3
 
< 0.1%
Space Separator 2
 
< 0.1%
Other Punctuation 1
 
< 0.1%
Close Punctuation 1
 
< 0.1%
Currency Symbol 1
 
< 0.1%
Modifier Symbol 1
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 17641
 
10.9%
D 12972
 
8.0%
K 11557
 
7.1%
B 10961
 
6.8%
C 9841
 
6.1%
G 8160
 
5.0%
F 7960
 
4.9%
H 7770
 
4.8%
J 7665
 
4.7%
A 7143
 
4.4%
Other values (14) 60131
37.2%
Decimal Number
ValueCountFrequency (%)
0 26451
16.2%
5 19722
12.1%
4 18319
11.2%
1 16211
9.9%
2 15987
9.8%
6 15976
9.8%
3 13938
8.5%
7 12684
7.8%
8 12440
7.6%
9 11337
7.0%
Lowercase Letter
ValueCountFrequency (%)
e 1
25.0%
b 1
25.0%
u 1
25.0%
n 1
25.0%
Dash Punctuation
ValueCountFrequency (%)
- 3
100.0%
Space Separator
ValueCountFrequency (%)
(space) 2
100.0%
Other Punctuation
ValueCountFrequency (%)
: 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%
Currency Symbol
ValueCountFrequency (%)
$ 1
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 163074
50.2%
Latin 161805
49.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 17641
 
10.9%
D 12972
 
8.0%
K 11557
 
7.1%
B 10961
 
6.8%
C 9841
 
6.1%
G 8160
 
5.0%
F 7960
 
4.9%
H 7770
 
4.8%
J 7665
 
4.7%
A 7143
 
4.4%
Other values (18) 60135
37.2%
Common
ValueCountFrequency (%)
0 26451
16.2%
5 19722
12.1%
4 18319
11.2%
1 16211
9.9%
2 15987
9.8%
6 15976
9.8%
3 13938
8.5%
7 12684
7.8%
8 12440
7.6%
9 11337
7.0%
Other values (6) 9
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 324879
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 26451
 
8.1%
5 19722
 
6.1%
4 18319
 
5.6%
E 17641
 
5.4%
1 16211
 
5.0%
2 15987
 
4.9%
6 15976
 
4.9%
3 13938
 
4.3%
D 12972
 
4.0%
7 12684
 
3.9%
Other values (34) 154978
47.7%

FIRE_NAME
Categorical

HIGH CARDINALITY
MISSING

Distinct151885
Distinct (%)66.0%
Missing240071
Missing (%)51.1%
Memory size7.2 MiB
GRASS FIRE
 
983
UNKNOWN
 
821
LOCAL
 
535
STATE
 
346
LOCAL FIRE
 
188
Other values (151880)
227172 

Length

Max length50
Median length44
Mean length11.571114
Min length1

Characters and Unicode

Total characters2661877
Distinct characters64
Distinct categories11
Distinct scripts2
Distinct blocks2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique131270
Unique (%)57.1%

Sample

1st rowMESSY PLACE
2nd rowELSINORE MTN (ADAM)
3rd rowWHOOPEE CREEK
4th rowI-15 NR18
5th rowFALSE ALARM

Common Values

ValueCountFrequency (%)
GRASS FIRE 983
 
0.2%
UNKNOWN 821
 
0.2%
LOCAL 535
 
0.1%
STATE 346
 
0.1%
LOCAL FIRE 188
 
< 0.1%
COTTONWOOD 167
 
< 0.1%
WILLOW 158
 
< 0.1%
POWERLINE 155
 
< 0.1%
ROCK 155
 
< 0.1%
LOCAL 151
 
< 0.1%
Other values (151875) 226386
48.2%
(Missing) 240071
51.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
fire 16499
 
4.1%
creek 6941
 
1.7%
rd 6255
 
1.6%
2 5372
 
1.4%
road 4905
 
1.2%
4822
 
1.2%
1 2994
 
0.8%
lake 2466
 
0.6%
hwy 2358
 
0.6%
grass 2262
 
0.6%
Other values (81854) 342705
86.2%

Most occurring characters

ValueCountFrequency (%)
(space) 492343
18.5%
E 192066
 
7.2%
R 167817
 
6.3%
A 143185
 
5.4%
O 122850
 
4.6%
N 113099
 
4.2%
I 112657
 
4.2%
L 109792
 
4.1%
S 95485
 
3.6%
T 92371
 
3.5%
Other values (54) 1020212
38.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1771130
66.5%
Space Separator 492343
 
18.5%
Decimal Number 326827
 
12.3%
Dash Punctuation 40716
 
1.5%
Other Punctuation 23090
 
0.9%
Open Punctuation 3632
 
0.1%
Close Punctuation 3614
 
0.1%
Connector Punctuation 489
 
< 0.1%
Modifier Symbol 22
 
< 0.1%
Math Symbol 8
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 192066
 
10.8%
R 167817
 
9.5%
A 143185
 
8.1%
O 122850
 
6.9%
N 113099
 
6.4%
I 112657
 
6.4%
L 109792
 
6.2%
S 95485
 
5.4%
T 92371
 
5.2%
C 71741
 
4.1%
Other values (17) 550067
31.1%
Other Punctuation
ValueCountFrequency (%)
# 7760
33.6%
. 7208
31.2%
/ 3811
16.5%
, 1123
 
4.9%
' 1086
 
4.7%
& 998
 
4.3%
" 751
 
3.3%
@ 224
 
1.0%
! 37
 
0.2%
? 33
 
0.1%
Other values (4) 59
 
0.3%
Decimal Number
ValueCountFrequency (%)
0 87013
26.6%
1 60555
18.5%
2 54388
16.6%
3 26366
 
8.1%
4 21864
 
6.7%
5 20422
 
6.2%
6 15374
 
4.7%
7 13707
 
4.2%
8 13599
 
4.2%
9 13539
 
4.1%
Open Punctuation
ValueCountFrequency (%)
( 3618
99.6%
[ 10
 
0.3%
{ 4
 
0.1%
Close Punctuation
ValueCountFrequency (%)
) 3603
99.7%
] 10
 
0.3%
} 1
 
< 0.1%
Math Symbol
ValueCountFrequency (%)
+ 6
75.0%
= 2
 
25.0%
Space Separator
ValueCountFrequency (%)
(space) 492343
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 40716
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 489
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 22
100.0%
Currency Symbol
ValueCountFrequency (%)
$ 6
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1771130
66.5%
Common 890747
33.5%

Most frequent character per script

Common
ValueCountFrequency (%)
(space) 492343
55.3%
0 87013
 
9.8%
1 60555
 
6.8%
2 54388
 
6.1%
- 40716
 
4.6%
3 26366
 
3.0%
4 21864
 
2.5%
5 20422
 
2.3%
6 15374
 
1.7%
7 13707
 
1.5%
Other values (27) 57999
 
6.5%
Latin
ValueCountFrequency (%)
E 192066
 
10.8%
R 167817
 
9.5%
A 143185
 
8.1%
O 122850
 
6.9%
N 113099
 
6.4%
I 112657
 
6.4%
L 109792
 
6.2%
S 95485
 
5.4%
T 92371
 
5.2%
C 71741
 
4.1%
Other values (17) 550067
31.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2661873
> 99.9%
None 4
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
(space) 492343
18.5%
E 192066
 
7.2%
R 167817
 
6.3%
A 143185
 
5.4%
O 122850
 
4.6%
N 113099
 
4.2%
I 112657
 
4.2%
L 109792
 
4.1%
S 95485
 
3.6%
T 92371
 
3.5%
Other values (53) 1020208
38.3%
None
ValueCountFrequency (%)
Ñ 4
100.0%

ICS_209_INCIDENT_NUMBER
Categorical

HIGH CARDINALITY
MISSING
UNIFORM

Distinct6070
Distinct (%)93.5%
Missing463627
Missing (%)98.6%
Memory size7.2 MiB
OR-UPF-009121
 
14
WA-OWF-000583
 
13
OK-OSA-100020
 
13
ID-PAF-006068
 
12
MT-BRF-000135
 
9
Other values (6065)
6428 

Length

Max length19
Median length17
Mean length12.122823
Min length5

Characters and Unicode

Total characters78665
Distinct characters46
Distinct categories8
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique5855
Unique (%)90.2%

Sample

1st rowMI-MIS-013
2nd rowUT-SLD-1519
3rd rowMT-BDF-117
4th rowOR-95S-754
5th rowMO-MTF-000106

Common Values

ValueCountFrequency (%)
OR-UPF-009121 14
 
< 0.1%
WA-OWF-000583 13
 
< 0.1%
OK-OSA-100020 13
 
< 0.1%
ID-PAF-006068 12
 
< 0.1%
MT-BRF-000135 9
 
< 0.1%
MT-FNF-037 9
 
< 0.1%
CA-MDF-000388 9
 
< 0.1%
CA-MNF-000663 8
 
< 0.1%
CA-SRF-1120 8
 
< 0.1%
WA-MSF-00177 8
 
< 0.1%
Other values (6060) 6386
 
1.4%
(Missing) 463627
98.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
or-upf-009121 14
 
0.2%
ok-osa-100020 13
 
0.2%
wa-owf-000583 13
 
0.2%
id-paf-006068 12
 
0.2%
mt-brf-000135 9
 
0.1%
mt-fnf-037 9
 
0.1%
ca-mdf-000388 9
 
0.1%
wa-msf-00177 8
 
0.1%
ca-srf-1120 8
 
0.1%
ca-mnf-000663 8
 
0.1%
Other values (6082) 6420
98.4%

Most occurring characters

ValueCountFrequency (%)
- 12597
16.0%
0 11651
 
14.8%
1 5012
 
6.4%
2 4008
 
5.1%
S 3615
 
4.6%
F 2756
 
3.5%
3 2686
 
3.4%
A 2558
 
3.3%
4 2315
 
2.9%
6 2226
 
2.8%
Other values (36) 29241
37.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 36150
46.0%
Uppercase Letter 29598
37.6%
Dash Punctuation 12597
 
16.0%
Space Separator 307
 
0.4%
Lowercase Letter 5
 
< 0.1%
Other Punctuation 4
 
< 0.1%
Connector Punctuation 3
 
< 0.1%
Currency Symbol 1
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S 3615
12.2%
F 2756
 
9.3%
A 2558
 
8.6%
C 1997
 
6.7%
N 1782
 
6.0%
M 1701
 
5.7%
D 1650
 
5.6%
T 1646
 
5.6%
O 1566
 
5.3%
K 1339
 
4.5%
Other values (16) 8988
30.4%
Decimal Number
ValueCountFrequency (%)
0 11651
32.2%
1 5012
13.9%
2 4008
 
11.1%
3 2686
 
7.4%
4 2315
 
6.4%
6 2226
 
6.2%
5 2096
 
5.8%
8 2081
 
5.8%
7 2039
 
5.6%
9 2036
 
5.6%
Lowercase Letter
ValueCountFrequency (%)
p 2
40.0%
y 1
20.0%
h 1
20.0%
s 1
20.0%
Other Punctuation
ValueCountFrequency (%)
/ 3
75.0%
? 1
 
25.0%
Dash Punctuation
ValueCountFrequency (%)
- 12597
100.0%
Space Separator
ValueCountFrequency (%)
(space) 307
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 3
100.0%
Currency Symbol
ValueCountFrequency (%)
$ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 49062
62.4%
Latin 29603
37.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 3615
12.2%
F 2756
 
9.3%
A 2558
 
8.6%
C 1997
 
6.7%
N 1782
 
6.0%
M 1701
 
5.7%
D 1650
 
5.6%
T 1646
 
5.6%
O 1566
 
5.3%
K 1339
 
4.5%
Other values (20) 8993
30.4%
Common
ValueCountFrequency (%)
- 12597
25.7%
0 11651
23.7%
1 5012
 
10.2%
2 4008
 
8.2%
3 2686
 
5.5%
4 2315
 
4.7%
6 2226
 
4.5%
5 2096
 
4.3%
8 2081
 
4.2%
7 2039
 
4.2%
Other values (6) 2351
 
4.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 78665
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 12597
16.0%
0 11651
 
14.8%
1 5012
 
6.4%
2 4008
 
5.1%
S 3615
 
4.6%
F 2756
 
3.5%
3 2686
 
3.4%
A 2558
 
3.3%
4 2315
 
2.9%
6 2226
 
2.8%
Other values (36) 29241
37.2%

ICS_209_NAME
Categorical

HIGH CARDINALITY
MISSING
UNIFORM

Distinct5723
Distinct (%)88.2%
Missing463627
Missing (%)98.6%
Memory size7.2 MiB
Tiller Complex
 
14
OSAGE-MIAMI COMPLEX
 
13
YAKIMA COMPLEX
 
13
South Fork Complex
 
12
Muldoon Complex
 
9
Other values (5718)
6428 

Length

Max length37
Median length25
Mean length10.920943
Min length2

Characters and Unicode

Total characters70866
Distinct characters73
Distinct categories8
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique5247
Unique (%)80.9%

Sample

1st rowMackinac #001
2nd rowConcrete
3rd rowMussigbrod Complex
4th rowPark
5th rowTurner South

Common Values

ValueCountFrequency (%)
Tiller Complex 14
 
< 0.1%
OSAGE-MIAMI COMPLEX 13
 
< 0.1%
YAKIMA COMPLEX 13
 
< 0.1%
South Fork Complex 12
 
< 0.1%
Muldoon Complex 9
 
< 0.1%
Ltl Salmon CK Fire Use Complex 9
 
< 0.1%
Selway-Salmon WFU Complex 9
 
< 0.1%
Middle Fork Complex 8
 
< 0.1%
Mad Complex 8
 
< 0.1%
Gold Hill Complex 8
 
< 0.1%
Other values (5713) 6386
 
1.4%
(Missing) 463627
98.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
complex 752
 
6.2%
fire 579
 
4.8%
creek 484
 
4.0%
road 196
 
1.6%
lake 133
 
1.1%
fork 112
 
0.9%
mountain 112
 
0.9%
2 103
 
0.9%
ridge 91
 
0.8%
river 76
 
0.6%
Other values (4221) 9436
78.2%

Most occurring characters

ValueCountFrequency (%)
(space) 5683
 
8.0%
e 5039
 
7.1%
o 3154
 
4.5%
r 3056
 
4.3%
a 3019
 
4.3%
l 2623
 
3.7%
E 2574
 
3.6%
i 2532
 
3.6%
C 2524
 
3.6%
R 2398
 
3.4%
Other values (63) 38264
54.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 34885
49.2%
Uppercase Letter 28741
40.6%
Space Separator 5683
 
8.0%
Decimal Number 1132
 
1.6%
Other Punctuation 257
 
0.4%
Dash Punctuation 122
 
0.2%
Open Punctuation 23
 
< 0.1%
Close Punctuation 23
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 5039
14.4%
o 3154
 
9.0%
r 3056
 
8.8%
a 3019
 
8.7%
l 2623
 
7.5%
i 2532
 
7.3%
n 2377
 
6.8%
t 1732
 
5.0%
s 1232
 
3.5%
d 1134
 
3.3%
Other values (16) 8987
25.8%
Uppercase Letter
ValueCountFrequency (%)
E 2574
 
9.0%
C 2524
 
8.8%
R 2398
 
8.3%
L 1796
 
6.2%
A 1762
 
6.1%
O 1716
 
6.0%
S 1643
 
5.7%
I 1374
 
4.8%
N 1335
 
4.6%
M 1300
 
4.5%
Other values (16) 10319
35.9%
Decimal Number
ValueCountFrequency (%)
2 232
20.5%
1 218
19.3%
0 146
12.9%
3 105
9.3%
4 100
8.8%
5 89
 
7.9%
7 71
 
6.3%
6 60
 
5.3%
9 57
 
5.0%
8 54
 
4.8%
Other Punctuation
ValueCountFrequency (%)
. 92
35.8%
# 72
28.0%
/ 43
16.7%
' 38
14.8%
& 10
 
3.9%
@ 1
 
0.4%
\ 1
 
0.4%
Space Separator
ValueCountFrequency (%)
(space) 5683
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 122
100.0%
Open Punctuation
ValueCountFrequency (%)
( 23
100.0%
Close Punctuation
ValueCountFrequency (%)
) 23
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 63626
89.8%
Common 7240
 
10.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 5039
 
7.9%
o 3154
 
5.0%
r 3056
 
4.8%
a 3019
 
4.7%
l 2623
 
4.1%
E 2574
 
4.0%
i 2532
 
4.0%
C 2524
 
4.0%
R 2398
 
3.8%
n 2377
 
3.7%
Other values (42) 34330
54.0%
Common
ValueCountFrequency (%)
(space) 5683
78.5%
2 232
 
3.2%
1 218
 
3.0%
0 146
 
2.0%
- 122
 
1.7%
3 105
 
1.5%
4 100
 
1.4%
. 92
 
1.3%
5 89
 
1.2%
# 72
 
1.0%
Other values (11) 381
 
5.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 70866
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
(space) 5683
 
8.0%
e 5039
 
7.1%
o 3154
 
4.5%
r 3056
 
4.3%
a 3019
 
4.3%
l 2623
 
3.7%
E 2574
 
3.6%
i 2532
 
3.6%
C 2524
 
3.6%
R 2398
 
3.4%
Other values (63) 38264
54.0%

MTBS_ID
Categorical

HIGH CARDINALITY
MISSING
UNIFORM

Distinct2733
Distinct (%)98.2%
Missing467334
Missing (%)99.4%
Memory size7.2 MiB
ID4542411459020120730
 
6
KY3686008359020011102
 
5
CA3985212144420080814
 
3
NV4011611706720070716
 
3
CA4064212358620150731
 
3
Other values (2728)
2762 

Length

Max length29
Median length21
Mean length21.049245
Min length13

Characters and Unicode

Total characters58559
Distinct characters37
Distinct categories3
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2697
Unique (%)96.9%

Sample

1st rowCA3344711729819970904
2nd rowWA4835212060019940725
3rd rowUT4168911240520010713
4th rowLA3205609306220110902
5th rowKS3892010181120060731

Common Values

ValueCountFrequency (%)
ID4542411459020120730 6
 
< 0.1%
KY3686008359020011102 5
 
< 0.1%
CA3985212144420080814 3
 
< 0.1%
NV4011611706720070716 3
 
< 0.1%
CA4064212358620150731 3
 
< 0.1%
ID4568011472320070706 3
 
< 0.1%
TX3209710116720110227 3
 
< 0.1%
NV3724211430320050622 3
 
< 0.1%
KY3685008330220001102 2
 
< 0.1%
WV3795308212620011108 2
 
< 0.1%
Other values (2723) 2749
 
0.6%
(Missing) 467334
99.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
id4542411459020120730 6
 
0.2%
ky3686008359020011102 5
 
0.2%
ca3985212144420080814 3
 
0.1%
nv4011611706720070716 3
 
0.1%
ca4064212358620150731 3
 
0.1%
id4568011472320070706 3
 
0.1%
tx3209710116720110227 3
 
0.1%
nv3724211430320050622 3
 
0.1%
nv3984111739119990804 2
 
0.1%
tx2883309556620080621 2
 
0.1%
Other values (2723) 2749
98.8%

Most occurring characters

ValueCountFrequency (%)
0 10983
18.8%
1 8641
14.8%
2 6321
10.8%
3 4315
 
7.4%
9 4287
 
7.3%
4 4195
 
7.2%
8 3587
 
6.1%
6 3371
 
5.8%
5 3233
 
5.5%
7 3184
 
5.4%
Other values (27) 6442
11.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 52117
89.0%
Uppercase Letter 6086
 
10.4%
Dash Punctuation 356
 
0.6%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 869
14.3%
T 505
 
8.3%
N 479
 
7.9%
M 436
 
7.2%
C 395
 
6.5%
K 382
 
6.3%
D 370
 
6.1%
O 324
 
5.3%
I 314
 
5.2%
V 265
 
4.4%
Other values (16) 1747
28.7%
Decimal Number
ValueCountFrequency (%)
0 10983
21.1%
1 8641
16.6%
2 6321
12.1%
3 4315
 
8.3%
9 4287
 
8.2%
4 4195
 
8.0%
8 3587
 
6.9%
6 3371
 
6.5%
5 3233
 
6.2%
7 3184
 
6.1%
Dash Punctuation
ValueCountFrequency (%)
- 356
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 52473
89.6%
Latin 6086
 
10.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 869
14.3%
T 505
 
8.3%
N 479
 
7.9%
M 436
 
7.2%
C 395
 
6.5%
K 382
 
6.3%
D 370
 
6.1%
O 324
 
5.3%
I 314
 
5.2%
V 265
 
4.4%
Other values (16) 1747
28.7%
Common
ValueCountFrequency (%)
0 10983
20.9%
1 8641
16.5%
2 6321
12.0%
3 4315
 
8.2%
9 4287
 
8.2%
4 4195
 
8.0%
8 3587
 
6.8%
6 3371
 
6.4%
5 3233
 
6.2%
7 3184
 
6.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 58559
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 10983
18.8%
1 8641
14.8%
2 6321
10.8%
3 4315
 
7.4%
9 4287
 
7.3%
4 4195
 
7.2%
8 3587
 
6.1%
6 3371
 
5.8%
5 3233
 
5.5%
7 3184
 
5.4%
Other values (27) 6442
11.0%

MTBS_FIRE_NAME
Categorical

HIGH CARDINALITY
MISSING

Distinct2362
Distinct (%)84.9%
Missing467334
Missing (%)99.4%
Memory size7.2 MiB
UNNAMED
 
190
MUSTANG COMPLEX
 
6
WILDCAT
 
5
CAMP CREEK
 
4
CANYON
 
4
Other values (2357)
2573 

Length

Max length49
Median length42
Mean length10.484543
Min length2

Characters and Unicode

Total characters29168
Distinct characters48
Distinct categories8
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2185
Unique (%)78.5%

Sample

1st rowMARGARITA
2nd rowBUTTE CK#2
3rd rowTHIOKOL
4th rowTD13
5th rowUNNAMED

Common Values

ValueCountFrequency (%)
UNNAMED 190
 
< 0.1%
MUSTANG COMPLEX 6
 
< 0.1%
WILDCAT 5
 
< 0.1%
CAMP CREEK 4
 
< 0.1%
CANYON 4
 
< 0.1%
BOULDER 4
 
< 0.1%
WINDMILL 4
 
< 0.1%
COUNTY LINE 4
 
< 0.1%
ANTELOPE COMPLEX 4
 
< 0.1%
NORTH FORK 4
 
< 0.1%
Other values (2352) 2553
 
0.5%
(Missing) 467334
99.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
complex 239
 
5.1%
creek 191
 
4.0%
unnamed 190
 
4.0%
fire 65
 
1.4%
2 65
 
1.4%
lake 52
 
1.1%
river 46
 
1.0%
fork 43
 
0.9%
canyon 42
 
0.9%
mountain 39
 
0.8%
Other values (2160) 3751
79.4%

Most occurring characters

ValueCountFrequency (%)
E 3038
 
10.4%
A 2125
 
7.3%
R 2016
 
6.9%
N 1947
 
6.7%
(space) 1946
 
6.7%
O 1927
 
6.6%
L 1669
 
5.7%
I 1499
 
5.1%
C 1298
 
4.5%
T 1247
 
4.3%
Other values (38) 10456
35.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 26296
90.2%
Space Separator 1946
 
6.7%
Decimal Number 466
 
1.6%
Open Punctuation 164
 
0.6%
Close Punctuation 164
 
0.6%
Other Punctuation 77
 
0.3%
Dash Punctuation 52
 
0.2%
Lowercase Letter 3
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 3038
 
11.6%
A 2125
 
8.1%
R 2016
 
7.7%
N 1947
 
7.4%
O 1927
 
7.3%
L 1669
 
6.3%
I 1499
 
5.7%
C 1298
 
4.9%
T 1247
 
4.7%
S 1203
 
4.6%
Other values (16) 8327
31.7%
Decimal Number
ValueCountFrequency (%)
2 112
24.0%
1 79
17.0%
0 53
11.4%
3 49
10.5%
4 36
 
7.7%
5 33
 
7.1%
7 32
 
6.9%
6 28
 
6.0%
8 25
 
5.4%
9 19
 
4.1%
Other Punctuation
ValueCountFrequency (%)
# 35
45.5%
. 19
24.7%
' 13
 
16.9%
/ 9
 
11.7%
& 1
 
1.3%
Lowercase Letter
ValueCountFrequency (%)
a 1
33.3%
n 1
33.3%
d 1
33.3%
Space Separator
ValueCountFrequency (%)
(space) 1946
100.0%
Open Punctuation
ValueCountFrequency (%)
( 164
100.0%
Close Punctuation
ValueCountFrequency (%)
) 164
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 52
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 26299
90.2%
Common 2869
 
9.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 3038
 
11.6%
A 2125
 
8.1%
R 2016
 
7.7%
N 1947
 
7.4%
O 1927
 
7.3%
L 1669
 
6.3%
I 1499
 
5.7%
C 1298
 
4.9%
T 1247
 
4.7%
S 1203
 
4.6%
Other values (19) 8330
31.7%
Common
ValueCountFrequency (%)
(space) 1946
67.8%
( 164
 
5.7%
) 164
 
5.7%
2 112
 
3.9%
1 79
 
2.8%
0 53
 
1.8%
- 52
 
1.8%
3 49
 
1.7%
4 36
 
1.3%
# 35
 
1.2%
Other values (9) 179
 
6.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 29168
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 3038
 
10.4%
A 2125
 
7.3%
R 2016
 
6.9%
N 1947
 
6.7%
1946
 
6.7%
O 1927
 
6.6%
L 1669
 
5.7%
I 1499
 
5.1%
C 1298
 
4.5%
T 1247
 
4.3%
Other values (38) 10456
35.8%

COMPLEX_NAME
Categorical

HIGH CARDINALITY
MISSING

Distinct: 690
Distinct (%): 52.4%
Missing: 468800
Missing (%): 99.7%
Memory size: 7.2 MiB

TILLER COMPLEX        20
OSAGE-MIAMI COMPLEX   13
SOUTH FORK COMPLEX    13
YAKIMA COMPLEX        13
VALLEY COMPLEX        12
Other values (685)    1245

Length

Max length: 43
Median length: 33
Mean length: 17.550912
Min length: 7

Characters and Unicode

Total characters: 23097
Distinct characters: 46
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
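
As a rough illustration of the character properties mentioned above, the sketch below counts the characters of one sample value by Unicode general category using Python's standard `unicodedata` module. The two-letter codes it returns correspond to the long names used in this report (`Lu` is Uppercase Letter, `Zs` is Space Separator, `Pd` is Dash Punctuation). The helper name `category_counts` is an assumption made for this example and is not part of the profiling tool.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count the characters of `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


# One of the COMPLEX_NAME values listed in this report.
counts = category_counts("OSAGE-MIAMI COMPLEX")
print(counts)
```

A non-ASCII space such as U+00A0 is also classified as `Zs`, which is one possible explanation for the 27 Space Separator characters that the report places outside the ASCII block; the report itself does not say which character it is.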

Unique

Unique: 417
Unique (%): 31.7%

Sample

1st row: VALLEY COMPLEX
2nd row: GILBERT COMPLEX
3rd row: 107 COMPLEX
4th row: YOLLA BOLLY WFU COMPLEX
5th row: HERMAN COMPLEX

Common Values

Value                                  Count    Frequency (%)
TILLER COMPLEX                         20       < 0.1%
OSAGE-MIAMI COMPLEX                    13       < 0.1%
SOUTH FORK COMPLEX                     13       < 0.1%
YAKIMA COMPLEX                         13       < 0.1%
VALLEY COMPLEX                         12       < 0.1%
CLEAR/NEZ COMPLEX                      9        < 0.1%
SELWAY-SALMON WFU COMPLEX              9        < 0.1%
MULDOON COMPLEX                        9        < 0.1%
LITTLE SALMON CREEK FIRE USE COMPLEX   9        < 0.1%
YOLLA BOLLY COMPLEX 2008               8        < 0.1%
Other values (680)                     1201     0.3%
(Missing)                              468800   99.7%
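
The percentages reported for COMPLEX_NAME can be reproduced from the counts, which also shows which denominator each one appears to use. The sketch below assumes that Missing (%) is taken over all 470116 observations, while Distinct (%) and Unique (%) are taken over the non-missing values only; that reading is inferred from the figures and is not stated by the report.

```python
N_ROWS = 470116    # number of observations in the dataset
MISSING = 468800   # missing COMPLEX_NAME values
DISTINCT = 690     # distinct non-missing values
UNIQUE = 417       # values that occur exactly once


def pct(part: int, whole: int) -> float:
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


non_missing = N_ROWS - MISSING             # rows that carry a complex name
missing_pct = pct(MISSING, N_ROWS)         # share of all rows
distinct_pct = pct(DISTINCT, non_missing)  # share of non-missing rows
unique_pct = pct(UNIQUE, non_missing)      # share of non-missing rows
print(non_missing, missing_pct, distinct_pct, unique_pct)
```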

Length

Histogram of lengths of the category
ValueCountFrequency (%)
complex 1305
38.4%
creek 57
 
1.7%
lake 47
 
1.4%
lightning 46
 
1.4%
fork 34
 
1.0%
south 30
 
0.9%
wfu 29
 
0.9%
river 28
 
0.8%
fire 26
 
0.8%
tiller 20
 
0.6%
Other values (743) 1777
52.3%

Most occurring characters

ValueCountFrequency (%)
E 2588
11.2%
L 2173
 
9.4%
O 2116
 
9.2%
2069
 
9.0%
C 1735
 
7.5%
M 1675
 
7.3%
P 1535
 
6.6%
X 1330
 
5.8%
A 979
 
4.2%
R 883
 
3.8%
Other values (36) 6014
26.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 20641
89.4%
Space Separator 2096
 
9.1%
Decimal Number 239
 
1.0%
Dash Punctuation 58
 
0.3%
Other Punctuation 53
 
0.2%
Close Punctuation 5
 
< 0.1%
Open Punctuation 5
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 2588
12.5%
L 2173
10.5%
O 2116
10.3%
C 1735
 
8.4%
M 1675
 
8.1%
P 1535
 
7.4%
X 1330
 
6.4%
A 979
 
4.7%
R 883
 
4.3%
I 721
 
3.5%
Other values (16) 4906
23.8%
Decimal Number
ValueCountFrequency (%)
0 72
30.1%
2 47
19.7%
7 24
 
10.0%
1 19
 
7.9%
8 17
 
7.1%
5 15
 
6.3%
4 15
 
6.3%
9 12
 
5.0%
3 12
 
5.0%
6 6
 
2.5%
Other Punctuation
ValueCountFrequency (%)
/ 29
54.7%
. 13
24.5%
& 5
 
9.4%
' 4
 
7.5%
# 2
 
3.8%
Space Separator
ValueCountFrequency (%)
2069
98.7%
  27
 
1.3%
Dash Punctuation
ValueCountFrequency (%)
- 58
100.0%
Close Punctuation
ValueCountFrequency (%)
) 5
100.0%
Open Punctuation
ValueCountFrequency (%)
( 5
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 20641
89.4%
Common 2456
 
10.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 2588
12.5%
L 2173
10.5%
O 2116
10.3%
C 1735
 
8.4%
M 1675
 
8.1%
P 1535
 
7.4%
X 1330
 
6.4%
A 979
 
4.7%
R 883
 
4.3%
I 721
 
3.5%
Other values (16) 4906
23.8%
Common
ValueCountFrequency (%)
2069
84.2%
0 72
 
2.9%
- 58
 
2.4%
2 47
 
1.9%
/ 29
 
1.2%
  27
 
1.1%
7 24
 
1.0%
1 19
 
0.8%
8 17
 
0.7%
5 15
 
0.6%
Other values (10) 79
 
3.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 23070
99.9%
None 27
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 2588
11.2%
L 2173
 
9.4%
O 2116
 
9.2%
2069
 
9.0%
C 1735
 
7.5%
M 1675
 
7.3%
P 1535
 
6.7%
X 1330
 
5.8%
A 979
 
4.2%
R 883
 
3.8%
Other values (35) 5987
26.0%
None
ValueCountFrequency (%)
  27
100.0%

FIRE_YEAR
Real number (ℝ)

Distinct: 24
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2003.7116
Minimum: 1992
Maximum: 2015
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.2 MiB

Quantile statistics

Minimum: 1992
5-th percentile: 1993
Q1: 1998
Median: 2004
Q3: 2009
95-th percentile: 2014
Maximum: 2015
Range: 23
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 6.6597798
Coefficient of variation (CV): 0.0033237218
Kurtosis: -1.1129801
Mean: 2003.7116
Median Absolute Deviation (MAD): 5
Skewness: -0.057481255
Sum: 9.4197686 × 10^8
Variance: 44.352667
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=24)
ValueCountFrequency (%)
2006 28489
 
6.1%
2000 24344
 
5.2%
2007 23955
 
5.1%
2011 22609
 
4.8%
1999 22343
 
4.8%
2005 22116
 
4.7%
2001 21711
 
4.6%
2008 21339
 
4.5%
2010 20312
 
4.3%
2009 19478
 
4.1%
Other values (14) 243420
51.8%
ValueCountFrequency (%)
1992 17017
3.6%
1993 15476
3.3%
1994 18859
4.0%
1995 17654
3.8%
1996 18949
4.0%
1997 15450
3.3%
1998 17184
3.7%
1999 22343
4.8%
2000 24344
5.2%
2001 21711
4.6%
ValueCountFrequency (%)
2015 18571
4.0%
2014 16768
3.6%
2013 16251
3.5%
2012 18245
3.9%
2011 22609
4.8%
2010 20312
4.3%
2009 19478
4.1%
2008 21339
4.5%
2007 23955
5.1%
2006 28489
6.1%
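
Several of the FIRE_YEAR descriptive statistics are simple functions of the others, so they can be cross-checked by hand. The sketch below recomputes the coefficient of variation (standard deviation divided by mean), the variance (standard deviation squared), the range and the interquartile range from the values printed in this section. The variable names are chosen for this example only.

```python
import math

# Values copied from the FIRE_YEAR section of this report.
mean = 2003.7116
std = 6.6597798
minimum, maximum = 1992, 2015
q1, q3 = 1998, 2009

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
value_range = maximum - minimum  # range
iqr = q3 - q1                    # interquartile range

print(cv, variance, value_range, iqr)
```

The recomputed figures agree with the reported CV (0.0033237218) and variance (44.352667) up to rounding of the printed inputs.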

DISCOVERY_DATE
Real number (ℝ)

Distinct: 8760
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2453064.2
Minimum: 2448622.5
Maximum: 2457387.5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.2 MiB

Quantile statistics

Minimum: 2448622.5
5-th percentile: 2449147.5
Q1: 2451087.5
Median: 2453177.5
Q3: 2455038.5
95-th percentile: 2456866.5
Maximum: 2457387.5
Range: 8765
Interquartile range (IQR): 3951

Descriptive statistics

Standard deviation: 2433.5043
Coefficient of variation (CV): 0.00099202632
Kurtosis: -1.1056343
Mean: 2453064.2
Median Absolute Deviation (MAD): 1965
Skewness: -0.058136169
Sum: 1.1532247 × 10^12
Variance: 5921943.1
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
2454506.5 303
 
0.1%
2455611.5 294
 
0.1%
2453441.5 283
 
0.1%
2453798.5 273
 
0.1%
2456112.5 259
 
0.1%
2453799.5 258
 
0.1%
2449430.5 257
 
0.1%
2449773.5 255
 
0.1%
2448683.5 249
 
0.1%
2453476.5 245
 
0.1%
Other values (8750) 467440
99.4%
ValueCountFrequency (%)
2448622.5 34
< 0.1%
2448623.5 12
 
< 0.1%
2448624.5 15
 
< 0.1%
2448625.5 16
< 0.1%
2448626.5 13
 
< 0.1%
2448627.5 23
< 0.1%
2448628.5 38
< 0.1%
2448629.5 20
< 0.1%
2448630.5 10
 
< 0.1%
2448631.5 18
< 0.1%
ValueCountFrequency (%)
2457387.5 8
< 0.1%
2457386.5 8
< 0.1%
2457385.5 5
 
< 0.1%
2457384.5 3
 
< 0.1%
2457383.5 3
 
< 0.1%
2457382.5 13
< 0.1%
2457381.5 1
 
< 0.1%
2457380.5 8
< 0.1%
2457379.5 11
< 0.1%
2457378.5 13
< 0.1%
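
The DISCOVERY_DATE values look like Julian Dates (day counts ending in .5, which marks midnight), although the report itself profiles them as plain real numbers. Under that assumption, the sketch below converts them to calendar dates with the standard library only. It relies on the fact that 1970-01-01 00:00 UTC is Julian Date 2440587.5. The function name `julian_to_date` is hypothetical.

```python
from datetime import date, timedelta

# Julian Date of 1970-01-01 00:00 UTC.
UNIX_EPOCH_JD = 2440587.5


def julian_to_date(jd: float) -> date:
    """Convert a Julian Date that falls on midnight (x.5) to a calendar date."""
    days_since_epoch = round(jd - UNIX_EPOCH_JD)
    return date(1970, 1, 1) + timedelta(days=days_since_epoch)


first_discovery = julian_to_date(2448622.5)  # reported minimum
last_discovery = julian_to_date(2457387.5)   # reported maximum
print(first_discovery, last_discovery)
```

With this reading the minimum and maximum fall on 1992-01-01 and 2015-12-31, which is consistent with the FIRE_YEAR range of 1992 to 2015.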

DISCOVERY_DOY
Real number (ℝ)

Distinct: 366
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 164.70571
Minimum: 1
Maximum: 366
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.2 MiB

Quantile statistics

Minimum: 1
5-th percentile: 31
Q1: 89
Median: 164
Q3: 230
95-th percentile: 322
Maximum: 366
Range: 365
Interquartile range (IQR): 141

Descriptive statistics

Standard deviation: 89.971552
Coefficient of variation (CV): 0.54625641
Kurtosis: -0.88832905
Mean: 164.70571
Median Absolute Deviation (MAD): 71
Skewness: 0.22516879
Sum: 77430791
Variance: 8094.8801
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
185 3159
 
0.7%
186 2956
 
0.6%
108 2386
 
0.5%
101 2353
 
0.5%
100 2295
 
0.5%
83 2275
 
0.5%
67 2241
 
0.5%
109 2236
 
0.5%
107 2234
 
0.5%
95 2199
 
0.5%
Other values (356) 445782
94.8%
ValueCountFrequency (%)
1 1000
0.2%
2 675
0.1%
3 625
0.1%
4 616
0.1%
5 667
0.1%
6 639
0.1%
7 756
0.2%
8 684
0.1%
9 597
0.1%
10 565
0.1%
ValueCountFrequency (%)
366 132
 
< 0.1%
365 729
0.2%
364 518
0.1%
363 506
0.1%
362 602
0.1%
361 622
0.1%
360 518
0.1%
359 305
0.1%
358 406
0.1%
357 437
0.1%

DISCOVERY_TIME
Real number (ℝ)

HIGH CORRELATION
MISSING

Distinct: 1440
Distinct (%): 0.6%
Missing: 220721
Missing (%): 47.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1452.2666
Minimum: 0
Maximum: 2359
Zeros: 175
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 7.2 MiB

Quantile statistics

Minimum: 0
5-th percentile: 740
Q1: 1239
Median: 1457
Q3: 1708
95-th percentile: 2100
Maximum: 2359
Range: 2359
Interquartile range (IQR): 469

Descriptive statistics

Standard deviation: 406.50047
Coefficient of variation (CV): 0.27990761
Kurtosis: 1.4359067
Mean: 1452.2666
Median Absolute Deviation (MAD): 235
Skewness: -0.70117019
Sum: 3.6218803 × 10^8
Variance: 165242.63
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1400 5273
 
1.1%
1500 5057
 
1.1%
1600 4410
 
0.9%
1300 4341
 
0.9%
1700 3509
 
0.7%
1200 3477
 
0.7%
1530 3330
 
0.7%
1430 3222
 
0.7%
1800 2985
 
0.6%
1630 2904
 
0.6%
Other values (1430) 210887
44.9%
(Missing) 220721
47.0%
ValueCountFrequency (%)
0 175
< 0.1%
1 199
< 0.1%
2 28
 
< 0.1%
3 24
 
< 0.1%
4 20
 
< 0.1%
5 63
 
< 0.1%
6 15
 
< 0.1%
7 16
 
< 0.1%
8 24
 
< 0.1%
9 32
 
< 0.1%
ValueCountFrequency (%)
2359 76
< 0.1%
2358 28
 
< 0.1%
2357 38
< 0.1%
2356 22
 
< 0.1%
2355 49
< 0.1%
2354 27
 
< 0.1%
2353 23
 
< 0.1%
2352 23
 
< 0.1%
2351 14
 
< 0.1%
2350 80
< 0.1%
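
The DISCOVERY_TIME values range from 0 to 2359 with 1440 distinct values (24 × 60), which suggests a clock time written as HHMM rather than a true continuous number. Assuming that encoding, the sketch below converts a value to minutes after midnight so that arithmetic on it is meaningful. The function name `hhmm_to_minutes` is an assumption made for this example.

```python
def hhmm_to_minutes(hhmm: int) -> int:
    """Convert a clock time written as HHMM (e.g. 1457) to minutes after midnight."""
    hours, minutes = divmod(int(hhmm), 100)
    if not (0 <= hours <= 23 and 0 <= minutes <= 59):
        raise ValueError(f"not a valid HHMM time: {hhmm}")
    return hours * 60 + minutes


median_minutes = hhmm_to_minutes(1457)  # reported median, 14:57
print(median_minutes)
```

Because HHMM skips from 1459 to 1500, statistics such as the mean and standard deviation computed directly on the raw values, as in this section, are only approximate descriptions of the time of day.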

STAT_CAUSE_CODE
Real number (ℝ)

Distinct: 13
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.9771078
Minimum: 1
Maximum: 13
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.2 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
Median: 5
Q3: 9
95-th percentile: 13
Maximum: 13
Range: 12
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.4839815
Coefficient of variation (CV): 0.58288752
Kurtosis: -0.60351971
Mean: 5.9771078
Median Absolute Deviation (MAD): 3
Skewness: 0.31029225
Sum: 2809934
Variance: 12.138127
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=13)
ValueCountFrequency (%)
5 107275
22.8%
9 81400
17.3%
7 70126
14.9%
1 69786
14.8%
13 41568
 
8.8%
2 36954
 
7.9%
4 18980
 
4.0%
8 15215
 
3.2%
3 13143
 
2.8%
6 8300
 
1.8%
Other values (3) 7369
 
1.6%
ValueCountFrequency (%)
1 69786
14.8%
2 36954
 
7.9%
3 13143
 
2.8%
4 18980
 
4.0%
5 107275
22.8%
6 8300
 
1.8%
7 70126
14.9%
8 15215
 
3.2%
9 81400
17.3%
10 2871
 
0.6%
ValueCountFrequency (%)
13 41568
 
8.8%
12 942
 
0.2%
11 3556
 
0.8%
10 2871
 
0.6%
9 81400
17.3%
8 15215
 
3.2%
7 70126
14.9%
6 8300
 
1.8%
5 107275
22.8%
4 18980
 
4.0%

STAT_CAUSE_DESCR
Categorical

Distinct: 13
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.2 MiB

Debris Burning       107275
Miscellaneous        81400
Arson                70126
Lightning            69786
Missing/Undefined    41568
Other values (8)     99961

Length

Max length: 17
Median length: 13
Mean length: 11.112351
Min length: 5

Characters and Unicode

Total characters: 5224094
Distinct characters: 35
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Miscellaneous
2nd row: Miscellaneous
3rd row: Missing/Undefined
4th row: Lightning
5th row: Equipment Use

Common Values

Value               Count    Frequency (%)
Debris Burning      107275   22.8%
Miscellaneous       81400    17.3%
Arson               70126    14.9%
Lightning           69786    14.8%
Missing/Undefined   41568    8.8%
Equipment Use       36954    7.9%
Campfire            18980    4.0%
Children            15215    3.2%
Smoking             13143    2.8%
Railroad            8300     1.8%
Other values (3)    7369     1.6%
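
The counts in this table line up one-to-one with the counts listed for STAT_CAUSE_CODE, which suggests that each code corresponds to one description. The sketch below pairs them by matching counts. This is an inference from the two frequency tables only; confirming it would require cross-tabulating the two columns in the underlying data. Codes 10, 11 and 12 are left out because their descriptions are folded into "Other values (3)" here.

```python
# Counts per code, copied from the STAT_CAUSE_CODE section.
code_counts = {
    5: 107275, 9: 81400, 7: 70126, 1: 69786, 13: 41568,
    2: 36954, 4: 18980, 8: 15215, 3: 13143, 6: 8300,
}

# Counts per description, copied from the table above.
descr_counts = {
    "Debris Burning": 107275, "Miscellaneous": 81400, "Arson": 70126,
    "Lightning": 69786, "Missing/Undefined": 41568, "Equipment Use": 36954,
    "Campfire": 18980, "Children": 15215, "Smoking": 13143, "Railroad": 8300,
}

# Matching by count is only safe when every count is distinct.
count_to_descr = {count: descr for descr, count in descr_counts.items()}
if len(count_to_descr) != len(descr_counts):
    raise ValueError("duplicate counts: cannot match by count")

code_to_descr = {
    code: count_to_descr[count] for code, count in sorted(code_counts.items())
}
for code, descr in code_to_descr.items():
    print(code, descr)
```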

Length

Histogram of lengths of the category
ValueCountFrequency (%)
debris 107275
17.5%
burning 107275
17.5%
miscellaneous 81400
13.2%
arson 70126
11.4%
lightning 69786
11.4%
missing/undefined 41568
 
6.8%
equipment 36954
 
6.0%
use 36954
 
6.0%
campfire 18980
 
3.1%
children 15215
 
2.5%
Other values (5) 28812
 
4.7%

Most occurring characters

ValueCountFrequency (%)
n 699220
13.4%
i 659245
12.6%
e 472239
 
9.0%
s 463162
 
8.9%
r 338353
 
6.5%
g 301558
 
5.8%
u 227513
 
4.4%
l 189871
 
3.6%
o 179396
 
3.4%
144229
 
2.8%
Other values (25) 1549308
29.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4382384
83.9%
Uppercase Letter 655913
 
12.6%
Space Separator 144229
 
2.8%
Other Punctuation 41568
 
0.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 699220
16.0%
i 659245
15.0%
e 472239
10.8%
s 463162
10.6%
r 338353
7.7%
g 301558
6.9%
u 227513
 
5.2%
l 189871
 
4.3%
o 179396
 
4.1%
a 116980
 
2.7%
Other values (11) 734847
16.8%
Uppercase Letter
ValueCountFrequency (%)
M 122968
18.7%
D 107275
16.4%
B 107275
16.4%
U 78522
12.0%
A 70126
10.7%
L 69786
10.6%
E 36954
 
5.6%
C 34195
 
5.2%
S 14085
 
2.1%
R 8300
 
1.3%
Other values (2) 6427
 
1.0%
Space Separator
ValueCountFrequency (%)
144229
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 41568
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5038297
96.4%
Common 185797
 
3.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 699220
13.9%
i 659245
13.1%
e 472239
 
9.4%
s 463162
 
9.2%
r 338353
 
6.7%
g 301558
 
6.0%
u 227513
 
4.5%
l 189871
 
3.8%
o 179396
 
3.6%
M 122968
 
2.4%
Other values (23) 1384772
27.5%
Common
ValueCountFrequency (%)
144229
77.6%
/ 41568
 
22.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5224094
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 699220
13.4%
i 659245
12.6%
e 472239
 
9.0%
s 463162
 
8.9%
r 338353
 
6.5%
g 301558
 
5.8%
u 227513
 
4.4%
l 189871
 
3.6%
o 179396
 
3.4%
144229
 
2.8%
Other values (25) 1549308
29.7%

CONT_DATE
Real number (ℝ)

HIGH CORRELATION
MISSING

Distinct: 8704
Distinct (%): 3.5%
Missing: 222921
Missing (%): 47.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 2453242.3
Minimum: 2448622.5
Maximum: 2457388.5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.2 MiB

Quantile statistics

Minimum: 2448622.5
5-th percentile: 2449059.5
Q1: 2450713.5
Median: 2453471.5
Q3: 2455751.5
95-th percentile: 2457108.5
Maximum: 2457388.5
Range: 8766
Interquartile range (IQR): 5038

Descriptive statistics

Standard deviation: 2684.2119
Coefficient of variation (CV): 0.0010941487
Kurtosis: -1.321106
Mean: 2453242.3
Median Absolute Deviation (MAD): 2413
Skewness: -0.12424131
Sum: 6.0642924 × 10^11
Variance: 7204993.4
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
2455611.5 217
 
< 0.1%
2448683.5 176
 
< 0.1%
2457067.5 173
 
< 0.1%
2450309.5 172
 
< 0.1%
2449056.5 164
 
< 0.1%
2455606.5 159
 
< 0.1%
2450137.5 154
 
< 0.1%
2456112.5 152
 
< 0.1%
2456367.5 151
 
< 0.1%
2449430.5 146
 
< 0.1%
Other values (8694) 245531
52.2%
(Missing) 222921
47.4%
ValueCountFrequency (%)
2448622.5 17
< 0.1%
2448623.5 6
 
< 0.1%
2448624.5 9
 
< 0.1%
2448625.5 11
 
< 0.1%
2448626.5 9
 
< 0.1%
2448627.5 18
< 0.1%
2448628.5 32
< 0.1%
2448629.5 12
 
< 0.1%
2448630.5 5
 
< 0.1%
2448631.5 9
 
< 0.1%
ValueCountFrequency (%)
2457388.5 1
 
< 0.1%
2457387.5 7
< 0.1%
2457386.5 6
< 0.1%
2457385.5 6
< 0.1%
2457384.5 4
< 0.1%
2457383.5 3
< 0.1%
2457382.5 7
< 0.1%
2457381.5 1
 
< 0.1%
2457380.5 7
< 0.1%
2457379.5 5
< 0.1%

CONT_DOY
Real number (ℝ)

HIGH CORRELATION
MISSING

Distinct: 366
Distinct (%): 0.1%
Missing: 222921
Missing (%): 47.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 172.82454
Minimum: 1
Maximum: 366
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.2 MiB

Quantile statistics

Minimum: 1
5-th percentile: 39
Q1: 103
Median: 181
Q3: 232
95-th percentile: 316
Maximum: 366
Range: 365
Interquartile range (IQR): 129

Descriptive statistics

Standard deviation: 84.257454
Coefficient of variation (CV): 0.48753178
Kurtosis: -0.79649134
Mean: 172.82454
Median Absolute Deviation (MAD): 65
Skewness: 0.062250392
Sum: 42721361
Variance: 7099.3186
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
186 1825
 
0.4%
185 1820
 
0.4%
217 1373
 
0.3%
184 1333
 
0.3%
205 1330
 
0.3%
216 1325
 
0.3%
187 1317
 
0.3%
204 1309
 
0.3%
218 1306
 
0.3%
202 1305
 
0.3%
Other values (356) 232952
49.6%
(Missing) 222921
47.4%
ValueCountFrequency (%)
1 293
0.1%
2 177
< 0.1%
3 212
< 0.1%
4 208
< 0.1%
5 241
0.1%
6 275
0.1%
7 282
0.1%
8 224
< 0.1%
9 234
< 0.1%
10 224
< 0.1%
ValueCountFrequency (%)
366 37
 
< 0.1%
365 195
< 0.1%
364 158
< 0.1%
363 183
< 0.1%
362 237
0.1%
361 214
< 0.1%
360 188
< 0.1%
359 126
< 0.1%
358 160
< 0.1%
357 170
< 0.1%

CONT_TIME
Real number (ℝ)

HIGH CORRELATION
MISSING

Distinct: 1439
Distinct (%): 0.6%
Missing: 243096
Missing (%): 51.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 1535.1095
Minimum: 0
Maximum: 2359
Zeros: 117
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 7.2 MiB

Quantile statistics

Minimum: 0
5-th percentile: 800
Q1: 1310
Median: 1600
Q3: 1810
95-th percentile: 2200
Maximum: 2359
Range: 2359
Interquartile range (IQR): 500

Descriptive statistics

Standard deviation: 432.21998
Coefficient of variation (CV): 0.28155644
Kurtosis: 1.524699
Mean: 1535.1095
Median Absolute Deviation (MAD): 250
Skewness: -0.87705982
Sum: 3.4850056 × 10^8
Variance: 186814.11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1800 9579
 
2.0%
1600 5515
 
1.2%
1700 5200
 
1.1%
1200 4832
 
1.0%
1500 4719
 
1.0%
2000 4403
 
0.9%
1400 4028
 
0.9%
1900 3792
 
0.8%
1630 3638
 
0.8%
1300 3318
 
0.7%
Other values (1429) 177996
37.9%
(Missing) 243096
51.7%
ValueCountFrequency (%)
0 117
< 0.1%
1 136
< 0.1%
2 23
 
< 0.1%
3 17
 
< 0.1%
4 12
 
< 0.1%
5 33
 
< 0.1%
6 15
 
< 0.1%
7 27
 
< 0.1%
8 17
 
< 0.1%
9 15
 
< 0.1%
ValueCountFrequency (%)
2359 895
0.2%
2358 36
 
< 0.1%
2357 38
 
< 0.1%
2356 23
 
< 0.1%
2355 111
 
< 0.1%
2354 23
 
< 0.1%
2353 20
 
< 0.1%
2352 16
 
< 0.1%
2351 19
 
< 0.1%
2350 160
 
< 0.1%

FIRE_SIZE
Real number (ℝ)

Distinct: 6226
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 68.098671
Minimum: 1 × 10^-5
Maximum: 537627
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.2 MiB

Quantile statistics

Minimum: 1 × 10^-5
5-th percentile: 0.1
Q1: 0.1
Median: 1
Q3: 3.22
95-th percentile: 46
Maximum: 537627
Range: 537627
Interquartile range (IQR): 3.12

Descriptive statistics

Standard deviation: 2167.2148
Coefficient of variation (CV): 31.824627
Kurtosis: 19564.483
Mean: 68.098671
Median Absolute Deviation (MAD): 0.9
Skewness: 114.03714
Sum: 32014275
Variance: 4696820
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0.1 115222
24.5%
1 55493
 
11.8%
0.5 28389
 
6.0%
2 27462
 
5.8%
0.2 18879
 
4.0%
3 16510
 
3.5%
5 15520
 
3.3%
0.25 13339
 
2.8%
0.3 13300
 
2.8%
4 9533
 
2.0%
Other values (6216) 156469
33.3%
ValueCountFrequency (%)
1 × 10^-5 1
 
< 0.1%
0.0001 2
 
< 0.1%
0.00022 1
 
< 0.1%
0.001 30
< 0.1%
0.00159 1
 
< 0.1%
0.002 7
 
< 0.1%
0.003 10
 
< 0.1%
0.004 2
 
< 0.1%
0.005 4
 
< 0.1%
0.006 2
 
< 0.1%
ValueCountFrequency (%)
537627 1
< 0.1%
499945 1
< 0.1%
314444 1
< 0.1%
297845 1
< 0.1%
283180 1
< 0.1%
275960 1
< 0.1%
243900 1
< 0.1%
238058 1
< 0.1%
220042.1 1
< 0.1%
213254.1 1
< 0.1%

FIRE_SIZE_CLASS
Categorical

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.2 MiB

B                   234942
A                   166778
C                   54739
D                   7196
E                   3560
Other values (2)    2901

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 470116
Distinct characters: 7
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: B
2nd row: B
3rd row: B
4th row: A
5th row: B

Common Values

Value   Count    Frequency (%)
B       234942   50.0%
A       166778   35.5%
C       54739    11.6%
D       7196     1.5%
E       3560     0.8%
F       1975     0.4%
G       926      0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
b 234942
50.0%
a 166778
35.5%
c 54739
 
11.6%
d 7196
 
1.5%
e 3560
 
0.8%
f 1975
 
0.4%
g 926
 
0.2%

Most occurring characters

ValueCountFrequency (%)
B 234942
50.0%
A 166778
35.5%
C 54739
 
11.6%
D 7196
 
1.5%
E 3560
 
0.8%
F 1975
 
0.4%
G 926
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 470116
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
B 234942
50.0%
A 166778
35.5%
C 54739
 
11.6%
D 7196
 
1.5%
E 3560
 
0.8%
F 1975
 
0.4%
G 926
 
0.2%

Most occurring scripts

ValueCountFrequency (%)
Latin 470116
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
B 234942
50.0%
A 166778
35.5%
C 54739
 
11.6%
D 7196
 
1.5%
E 3560
 
0.8%
F 1975
 
0.4%
G 926
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 470116
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
B 234942
50.0%
A 166778
35.5%
C 54739
 
11.6%
D 7196
 
1.5%
E 3560
 
0.8%
F 1975
 
0.4%
G 926
 
0.2%
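
The seven FIRE_SIZE_CLASS letters can be read as bins over FIRE_SIZE. The report does not define the bins, so the sketch below assumes the commonly used NWCG size classes in acres (A up to 0.25, B below 10, C below 100, D below 300, E below 1000, F below 5000, G at 5000 or more) and that FIRE_SIZE is recorded in acres. Both the thresholds and the function name `fire_size_class` are assumptions to be checked against the dataset's own documentation.

```python
# Assumed upper bounds in acres (exclusive) for classes B through F.
UPPER_BOUNDS = ((10, "B"), (100, "C"), (300, "D"), (1000, "E"), (5000, "F"))


def fire_size_class(acres: float) -> str:
    """Map a fire size in acres to an assumed NWCG size class letter."""
    if acres <= 0:
        raise ValueError("fire size must be positive")
    if acres <= 0.25:  # class A includes the 0.25-acre boundary
        return "A"
    for upper, label in UPPER_BOUNDS:
        if acres < upper:
            return label
    return "G"  # 5000 acres or more


for size in (0.1, 1, 46, 537627):
    print(size, fire_size_class(size))
```

Under these thresholds the most common FIRE_SIZE value in the report (0.1) falls in class A and the median (1) in class B, in line with A and B being by far the largest classes.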

LATITUDE
Real number (ℝ)

Distinct: 290432
Distinct (%): 61.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 36.778335
Minimum: 17.944924
Maximum: 70.1381
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.2 MiB

Quantile statistics

Minimum: 17.944924
5-th percentile: 29.19925
Q1: 32.81589
Median: 35.4617
Q3: 40.835
95-th percentile: 47.192782
Maximum: 70.1381
Range: 52.193176
Interquartile range (IQR): 8.01911

Descriptive statistics

Standard deviation: 6.1450804
Coefficient of variation (CV): 0.16708425
Kurtosis: 1.8846865
Mean: 36.778335
Median Absolute Deviation (MAD): 3.5491459
Skewness: 0.47314848
Sum: 17290084
Variance: 37.762013
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
47.8666 231
 
< 0.1%
33.3353 171
 
< 0.1%
33.3517 160
 
< 0.1%
17.970539 151
 
< 0.1%
47.8833 140
 
< 0.1%
35.3 135
 
< 0.1%
33.3167 127
 
< 0.1%
36.98888888 122
 
< 0.1%
41.0665 122
 
< 0.1%
34.01194444 121
 
< 0.1%
Other values (290422) 468636
99.7%
ValueCountFrequency (%)
17.944924 1
 
< 0.1%
17.95194 1
 
< 0.1%
17.953889 1
 
< 0.1%
17.956533 43
< 0.1%
17.956667 1
 
< 0.1%
17.957836 1
 
< 0.1%
17.958364 77
< 0.1%
17.95838 5
 
< 0.1%
17.96027778 1
 
< 0.1%
17.96083333 1
 
< 0.1%
ValueCountFrequency (%)
70.1381 1
< 0.1%
69.433 1
< 0.1%
69.3367 1
< 0.1%
69.2644 1
< 0.1%
69.2322 1
< 0.1%
69.2161 1
< 0.1%
69.1817 1
< 0.1%
69.05 1
< 0.1%
69.0458 1
< 0.1%
69.0263 1
< 0.1%

LONGITUDE
Real number (ℝ)

Distinct: 321751
Distinct (%): 68.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -95.725716
Minimum: -173.3857
Maximum: -65.264175
Zeros: 0
Zeros (%): 0.0%
Negative: 470116
Negative (%): 100.0%
Memory size: 7.2 MiB

Quantile statistics

Minimum: -173.3857
5-th percentile: -122.03921
Q1: -110.42253
Median: -92.083597
Q3: -82.298469
95-th percentile: -74.19167
Maximum: -65.264175
Range: 108.12153
Interquartile range (IQR): 28.124056

Descriptive statistics

Standard deviation: 16.742653
Coefficient of variation (CV): -0.17490236
Kurtosis: 0.14568673
Mean: -95.725716
Median Absolute Deviation (MAD): 11.025528
Skewness: -0.71676199
Sum: -45002191
Variance: 280.31643
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
-110.4518 216
 
< 0.1%
-123.6845 186
 
< 0.1%
-66.246414 151
 
< 0.1%
-110.4507 133
 
< 0.1%
-66.386131 90
 
< 0.1%
-110.4573 86
 
< 0.1%
-82 86
 
< 0.1%
-81.7 86
 
< 0.1%
-66.114357 85
 
< 0.1%
-95.0169 84
 
< 0.1%
Other values (321741) 468913
99.7%
ValueCountFrequency (%)
-173.3857 1
< 0.1%
-166.8694 1
< 0.1%
-166.1527 1
< 0.1%
-166.0294 1
< 0.1%
-165.8527 1
< 0.1%
-165.569 1
< 0.1%
-165.3932 1
< 0.1%
-165.2526 1
< 0.1%
-164.9526 1
< 0.1%
-164.936 1
< 0.1%
ValueCountFrequency (%)
-65.264175 2
 
< 0.1%
-65.27555556 1
 
< 0.1%
-65.28583333 1
 
< 0.1%
-65.2875 1
 
< 0.1%
-65.288067 1
 
< 0.1%
-65.2883 1
 
< 0.1%
-65.29111111 1
 
< 0.1%
-65.308556 6
< 0.1%
-65.31444444 1
 
< 0.1%
-65.326667 1
 
< 0.1%

OWNER_CODE
Real number (ℝ)

Distinct: 16
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.604708
Minimum: 0
Maximum: 15
Zeros: 6
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 7.2 MiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 8
Median: 14
Q3: 14
95-th percentile: 14
Maximum: 15
Range: 15
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 4.3981675
Coefficient of variation (CV): 0.41473724
Kurtosis: -0.80114827
Mean: 10.604708
Median Absolute Deviation (MAD): 0
Skewness: -0.82937242
Sum: 4985443
Variance: 19.343877
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=16)
ValueCountFrequency (%)
14 262971
55.9%
8 78653
 
16.7%
5 47248
 
10.1%
2 26381
 
5.6%
13 17955
 
3.8%
1 15734
 
3.3%
7 7863
 
1.7%
3 4280
 
0.9%
4 3063
 
0.7%
9 2237
 
0.5%
Other values (6) 3731
 
0.8%
ValueCountFrequency (%)
0 6
 
< 0.1%
1 15734
 
3.3%
2 26381
 
5.6%
3 4280
 
0.9%
4 3063
 
0.7%
5 47248
10.1%
6 1600
 
0.3%
7 7863
 
1.7%
8 78653
16.7%
9 2237
 
0.5%
ValueCountFrequency (%)
15 568
 
0.1%
14 262971
55.9%
13 17955
 
3.8%
12 1032
 
0.2%
11 454
 
0.1%
10 71
 
< 0.1%
9 2237
 
0.5%
8 78653
 
16.7%
7 7863
 
1.7%
6 1600
 
0.3%

OWNER_DESCR
Categorical

Distinct: 16
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.2 MiB

MISSING/NOT SPECIFIED   262971
PRIVATE                 78653
USFS                    47248
BIA                     26381
STATE OR PRIVATE        17955
Other values (11)       36908

Length

Max length: 21
Median length: 21
Mean length: 14.462941
Min length: 3

Characters and Unicode

Total characters: 6799260
Distinct characters: 23
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: MISSING/NOT SPECIFIED
2nd row: MISSING/NOT SPECIFIED
3rd row: MISSING/NOT SPECIFIED
4th row: USFS
5th row: MISSING/NOT SPECIFIED

Common Values

Value                   Count    Frequency (%)
MISSING/NOT SPECIFIED   262971   55.9%
PRIVATE                 78653    16.7%
USFS                    47248    10.1%
BIA                     26381    5.6%
STATE OR PRIVATE        17955    3.8%
BLM                     15734    3.3%
STATE                   7863     1.7%
NPS                     4280     0.9%
FWS                     3063     0.7%
TRIBAL                  2237     0.5%
Other values (6)        3731     0.8%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
missing/not 262971
34.1%
specified 262971
34.1%
private 96608
 
12.5%
usfs 47248
 
6.1%
bia 26381
 
3.4%
state 25818
 
3.3%
or 17955
 
2.3%
blm 15734
 
2.0%
nps 4280
 
0.6%
fws 3063
 
0.4%
Other values (8) 8136
 
1.1%

Most occurring characters

ValueCountFrequency (%)
I 1179748
17.4%
S 916570
13.5%
E 655446
9.6%
N 532850
 
7.8%
T 415506
 
6.1%
P 364891
 
5.4%
F 316024
 
4.6%
301049
 
4.4%
O 284089
 
4.2%
M 279737
 
4.1%
Other values (13) 1553350
22.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 6234208
91.7%
Space Separator 301049
 
4.4%
Other Punctuation 264003
 
3.9%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
I 1179748
18.9%
S 916570
14.7%
E 655446
10.5%
N 532850
8.5%
T 415506
 
6.7%
P 364891
 
5.9%
F 316024
 
5.1%
O 284089
 
4.6%
M 279737
 
4.5%
D 266275
 
4.3%
Other values (11) 1023072
16.4%
Space Separator
ValueCountFrequency (%)
301049
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 264003
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 6234208
91.7%
Common 565052
 
8.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
I 1179748
18.9%
S 916570
14.7%
E 655446
10.5%
N 532850
8.5%
T 415506
 
6.7%
P 364891
 
5.9%
F 316024
 
5.1%
O 284089
 
4.6%
M 279737
 
4.5%
D 266275
 
4.3%
Other values (11) 1023072
16.4%
Common
ValueCountFrequency (%)
301049
53.3%
/ 264003
46.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6799260
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
I 1179748
17.4%
S 916570
13.5%
E 655446
9.6%
N 532850
 
7.8%
T 415506
 
6.1%
P 364891
 
5.4%
F 316024
 
4.6%
301049
 
4.4%
O 284089
 
4.2%
M 279737
 
4.1%
Other values (13) 1553350
22.8%

STATE
Categorical

HIGH CARDINALITY
HIGH CORRELATION

Distinct: 52
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.2 MiB

CA                   47404
GA                   42072
TX                   35661
NC                   27728
FL                   22596
Other values (47)    294655

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 940232
Distinct characters: 24
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: LA
2nd row: NC
3rd row: UT
4th row: MT
5th row: CA

Common Values

Value                Count    Frequency (%)
CA                   47404    10.1%
GA                   42072    8.9%
TX                   35661    7.6%
NC                   27728    5.9%
FL                   22596    4.8%
NY                   20148    4.3%
SC                   20106    4.3%
MS                   19670    4.2%
AZ                   17854    3.8%
AL                   16854    3.6%
Other values (42)    200023   42.5%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
ca 47404
 
10.1%
ga 42072
 
8.9%
tx 35661
 
7.6%
nc 27728
 
5.9%
fl 22596
 
4.8%
ny 20148
 
4.3%
sc 20106
 
4.3%
ms 19670
 
4.2%
az 17854
 
3.8%
al 16854
 
3.6%
Other values (42) 200023
42.5%

Most occurring characters

ValueCountFrequency (%)
A 160433
17.1%
C 104920
11.2%
N 93804
 
10.0%
T 62580
 
6.7%
M 62576
 
6.7%
S 49451
 
5.3%
L 47350
 
5.0%
G 42072
 
4.5%
O 40124
 
4.3%
X 35661
 
3.8%
Other values (14) 241261
25.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 940232
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 160433
17.1%
C 104920
11.2%
N 93804
 
10.0%
T 62580
 
6.7%
M 62576
 
6.7%
S 49451
 
5.3%
L 47350
 
5.0%
G 42072
 
4.5%
O 40124
 
4.3%
X 35661
 
3.8%
Other values (14) 241261
25.7%

Most occurring scripts

ValueCountFrequency (%)
Latin 940232
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 160433
17.1%
C 104920
11.2%
N 93804
 
10.0%
T 62580
 
6.7%
M 62576
 
6.7%
S 49451
 
5.3%
L 47350
 
5.0%
G 42072
 
4.5%
O 40124
 
4.3%
X 35661
 
3.8%
Other values (14) 241261
25.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 940232
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 160433
17.1%
C 104920
11.2%
N 93804
 
10.0%
T 62580
 
6.7%
M 62576
 
6.7%
S 49451
 
5.3%
L 47350
 
5.0%
G 42072
 
4.5%
O 40124
 
4.3%
X 35661
 
3.8%
Other values (14) 241261
25.7%

COUNTY
Categorical

HIGH CARDINALITY
MISSING

Distinct: 3006
Distinct (%): 1.0%
Missing: 169082
Missing (%): 36.0%
Memory size: 7.2 MiB

5                      1911
SUFFOLK                1858
Lincoln                1820
Oahu                   1761
Cherokee               1739
Other values (3001)    291945

Length

Max length: 50
Median length: 19
Mean length: 7.1243946
Min length: 1

Characters and Unicode

Total characters: 2144685
Distinct characters: 72
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
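
As a rough illustration of what such a property-based analysis can look like, the sketch below counts characters by Unicode general category using Python's standard `unicodedata` module. The two-letter codes it returns (`Lu`, `Ll`, `Nd`, and so on) correspond to the category names used in this report (Uppercase Letter, Lowercase Letter, Decimal Number). The helper name `category_counts` is hypothetical, and this is not necessarily how the report itself was computed.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count characters by Unicode general category across an iterable of strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# The five sample rows shown for COUNTY in this report.
sample = ["SEVIER", "53", "Washburn", "69", "WASHINGTON"]
counts = category_counts(sample)
print(counts)
```

Run on the full column, a tally like this would be expected to reproduce the shape of the "Most occurring categories" table, with the digits coming from rows that hold numeric codes instead of names.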

Unique

Unique      305
Unique (%)  0.1%

Sample

1st row  SEVIER
2nd row  53
3rd row  Washburn
4th row  69
5th row  WASHINGTON

Common Values

Value                Count   Frequency (%)
5                    1911    0.4%
SUFFOLK              1858    0.4%
Lincoln              1820    0.4%
Oahu                 1761    0.4%
Cherokee             1739    0.4%
Washington           1734    0.4%
Polk                 1724    0.4%
Marion               1710    0.4%
Jackson              1595    0.3%
Lee                  1443    0.3%
Other values (2996)  283739  60.4%
(Missing)            169082  36.0%
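
The common values mix upper-case names (SUFFOLK), title-case names (Lincoln) and bare numeric codes (5), so the same county can be counted under several spellings. A minimal sketch of one possible clean-up follows, assuming that purely numeric entries are codes and not names. The function name `normalize_county` and the sample list are illustrative only.

```python
from collections import Counter


def normalize_county(value):
    """Return a title-cased county name, or None for missing or purely numeric entries."""
    if value is None:
        return None
    text = value.strip()
    if not text or text.isdigit():
        # Assumption: entries such as "5" or "53" are codes, not names.
        return None
    return text.title()


raw = ["SEVIER", "53", "Washburn", "69", "WASHINGTON", "Washington", None]
names = Counter(n for n in map(normalize_county, raw) if n is not None)
print(names)
```

Note that `str.title()` also capitalises the letter after an apostrophe, so names containing one would need extra handling.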

Length

Histogram of lengths of the category
Value                Count   Frequency (%)
county               5852    1.8%
san                  2624    0.8%
washington           2552    0.8%
st                   2509    0.8%
lincoln              2082    0.6%
jefferson            2071    0.6%
marion               1997    0.6%
cherokee             1972    0.6%
polk                 1925    0.6%
suffolk              1925    0.6%
Other values (1998)  299650  92.2%

Most occurring characters

Value              Count   Frequency (%)
(space)            231904  10.8%
a                  154919  7.2%
e                  146258  6.8%
n                  124828  5.8%
o                  124453  5.8%
r                  104332  4.9%
l                  87782   4.1%
i                  81533   3.8%
t                  72923   3.4%
s                  68296   3.2%
Other values (62)  947457  44.2%

Most occurring categories

Value                  Count    Frequency (%)
Lowercase Letter       1297009  60.5%
Uppercase Letter       534034   24.9%
Space Separator        231904   10.8%
Decimal Number         79808    3.7%
Other Punctuation      1787     0.1%
Dash Punctuation       132      < 0.1%
Connector Punctuation  7        < 0.1%
Open Punctuation       2        < 0.1%
Close Punctuation      2        < 0.1%

Most frequent character per category

Lowercase Letter
Value              Count   Frequency (%)
a                  154919  11.9%
e                  146258  11.3%
n                  124828  9.6%
o                  124453  9.6%
r                  104332  8.0%
l                  87782   6.8%
i                  81533   6.3%
t                  72923   5.6%
s                  68296   5.3%
u                  47558   3.7%
Other values (16)  284127  21.9%

Uppercase Letter
Value              Count   Frequency (%)
C                  46387   8.7%
S                  40133   7.5%
A                  38883   7.3%
E                  37422   7.0%
L                  33291   6.2%
O                  31975   6.0%
R                  31781   6.0%
N                  31177   5.8%
M                  29891   5.6%
B                  23361   4.4%
Other values (16)  189733  35.5%

Decimal Number
Value  Count  Frequency (%)
1      13561  17.0%
3      12249  15.3%
0      10259  12.9%
5      8608   10.8%
9      8255   10.3%
7      7563   9.5%
2      6976   8.7%
4      4970   6.2%
6      3834   4.8%
8      3533   4.4%

Other Punctuation
Value  Count  Frequency (%)
.      1626   91.0%
&      82     4.6%
'      68     3.8%
,      6      0.3%
/      5      0.3%

Space Separator
Value    Count   Frequency (%)
(space)  231904  100.0%

Dash Punctuation
Value  Count  Frequency (%)
-      132    100.0%

Connector Punctuation
Value  Count  Frequency (%)
_      7      100.0%

Open Punctuation
Value  Count  Frequency (%)
(      2      100.0%

Close Punctuation
Value  Count  Frequency (%)
)      2      100.0%

Most occurring scripts

Value   Count    Frequency (%)
Latin   1831043  85.4%
Common  313642   14.6%

Most frequent character per script

Latin
Value              Count   Frequency (%)
a                  154919  8.5%
e                  146258  8.0%
n                  124828  6.8%
o                  124453  6.8%
r                  104332  5.7%
l                  87782   4.8%
i                  81533   4.5%
t                  72923   4.0%
s                  68296   3.7%
u                  47558   2.6%
Other values (42)  818161  44.7%

Common
Value              Count   Frequency (%)
(space)            231904  73.9%
1                  13561   4.3%
3                  12249   3.9%
0                  10259   3.3%
5                  8608    2.7%
9                  8255    2.6%
7                  7563    2.4%
2                  6976    2.2%
4                  4970    1.6%
6                  3834    1.2%
Other values (10)  5463    1.7%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  2144685  100.0%

Most frequent character per block

ASCII
Value              Count   Frequency (%)
(space)            231904  10.8%
a                  154919  7.2%
e                  146258  6.8%
n                  124828  5.8%
o                  124453  5.8%
r                  104332  4.9%
l                  87782   4.1%
i                  81533   3.8%
t                  72923   3.4%
s                  68296   3.2%
Other values (62)  947457  44.2%

FIPS_CODE
Real number (ℝ)

HIGH CORRELATION
MISSING

Distinct      279
Distinct (%)  0.1%
Missing       169082
Missing (%)   36.0%
Infinite      0
Infinite (%)  0.0%
Mean          95.628221
Minimum       1
Maximum       810
Zeros         0
Zeros (%)     0.0%
Negative      0
Negative (%)  0.0%
Memory size   7.2 MiB

Quantile statistics

Minimum                    1
5-th percentile            5
Q1                         29
median                     67
Q3                         121
95-th percentile           313
Maximum                    810
Range                      809
Interquartile range (IQR)  92

Descriptive statistics

Standard deviation               98.571499
Coefficient of variation (CV)    1.0307783
Kurtosis                         3.911366
Mean                             95.628221
Median Absolute Deviation (MAD)  42
Skewness                         1.927033
Sum                              28787346
Variance                         9716.3405
Monotonicity                     Not monotonic
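
Several of these figures can be cross-checked against one another. The sketch below redoes that arithmetic from the reported values, taking the row count (470116) from the dataset overview. It relies on the usual definitions (IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared), and the variable names are illustrative.

```python
# Values copied from the FIPS_CODE statistics and the dataset overview.
n_rows, n_missing = 470116, 169082
mean, std, total = 95.628221, 98.571499, 28787346
q1, q3, minimum, maximum = 29, 121, 1, 810

n_valid = n_rows - n_missing      # observations that actually carry a FIPS_CODE
iqr = q3 - q1                     # interquartile range
value_range = maximum - minimum   # range
cv = std / mean                   # coefficient of variation
variance = std ** 2               # variance from the standard deviation
mean_from_sum = total / n_valid   # should match the reported mean

print(n_valid, iqr, value_range, round(cv, 7), round(variance, 4), round(mean_from_sum, 6))
```

The sum divided by the non-missing count reproduces the reported mean, which suggests the statistics are computed over non-missing rows only. Because a county FIPS code is an identifier and not a quantity, the mean and skewness here describe how the codes are distributed, not any physical measurement.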
Histogram with fixed size bins (bins=50)
Value               Count   Frequency (%)
3                   7295    1.6%
5                   7276    1.5%
29                  7066    1.5%
1                   6702    1.4%
7                   5681    1.2%
19                  5545    1.2%
17                  5383    1.1%
15                  5310    1.1%
35                  5232    1.1%
21                  4890    1.0%
Other values (269)  240654  51.2%
(Missing)           169082  36.0%

Value  Count  Frequency (%)
1      6702   1.4%
3      7295   1.6%
5      7276   1.5%
6      100    < 0.1%
7      5681   1.2%
9      4123   0.9%
11     3419   0.7%
12     36     < 0.1%
13     4451   0.9%
15     5310   1.1%

Value  Count  Frequency (%)
810    3      < 0.1%
800    14     < 0.1%
760    1      < 0.1%
730    1      < 0.1%
700    2      < 0.1%
550    3      < 0.1%
530    1      < 0.1%
510    33     < 0.1%
507    35     < 0.1%
505    46     < 0.1%

FIPS_NAME
Categorical

HIGH CARDINALITY
MISSING

Distinct      1637
Distinct (%)  0.5%
Missing       169082
Missing (%)   36.0%
Memory size   7.2 MiB

Value                Count
Washington           2818
Lincoln              2619
Jackson              2498
Marion               2271
Cherokee             2179
Other values (1632)  288649

Length

Max length     31
Median length  17
Mean length    6.9973491
Min length     3

Characters and Unicode

Total characters     2106440
Distinct characters  57
Distinct categories  7
Distinct scripts     2
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      57
Unique (%)  < 0.1%

Sample

1st row  Sevier
2nd row  Lincoln
3rd row  Washburn
4th row  Iosco
5th row  Washington

Common Values

Value                Count   Frequency (%)
Washington           2818    0.6%
Lincoln              2619    0.6%
Jackson              2498    0.5%
Marion               2271    0.5%
Cherokee             2179    0.5%
Polk                 2055    0.4%
Monroe               2011    0.4%
Coconino             1962    0.4%
Suffolk              1911    0.4%
Jefferson            1894    0.4%
Other values (1627)  278816  59.3%
(Missing)            169082  36.0%

Length

Histogram of lengths of the category
Value                Count   Frequency (%)
san                  3224    1.0%
washington           2818    0.9%
st                   2718    0.8%
lincoln              2619    0.8%
jackson              2498    0.8%
jefferson            2363    0.7%
marion               2271    0.7%
cherokee             2179    0.7%
polk                 2055    0.6%
monroe               2011    0.6%
Other values (1663)  296749  92.3%

Most occurring characters

Value              Count   Frequency (%)
a                  212233  10.1%
e                  203678  9.7%
n                  170211  8.1%
o                  170102  8.1%
r                  141688  6.7%
l                  121861  5.8%
i                  110464  5.2%
s                  95785   4.5%
t                  89867   4.3%
u                  60643   2.9%
Other values (47)  729908  34.7%

Most occurring categories

Value              Count    Frequency (%)
Lowercase Letter   1758150  83.5%
Uppercase Letter   324573   15.4%
Space Separator    20471    1.0%
Other Punctuation  2818     0.1%
Dash Punctuation   426      < 0.1%
Open Punctuation   1        < 0.1%
Close Punctuation  1        < 0.1%

Most frequent character per category

Lowercase Letter
Value              Count   Frequency (%)
a                  212233  12.1%
e                  203678  11.6%
n                  170211  9.7%
o                  170102  9.7%
r                  141688  8.1%
l                  121861  6.9%
i                  110464  6.3%
s                  95785   5.4%
t                  89867   5.1%
u                  60643   3.4%
Other values (16)  381618  21.7%

Uppercase Letter
Value              Count  Frequency (%)
C                  39572  12.2%
S                  28478  8.8%
M                  28386  8.7%
L                  24009  7.4%
B                  22552  6.9%
W                  19214  5.9%
H                  18166  5.6%
P                  17624  5.4%
R                  13613  4.2%
D                  13490  4.2%
Other values (15)  99469  30.6%

Other Punctuation
Value  Count  Frequency (%)
.      2720   96.5%
'      98     3.5%

Space Separator
Value    Count  Frequency (%)
(space)  20471  100.0%

Dash Punctuation
Value  Count  Frequency (%)
-      426    100.0%

Open Punctuation
Value  Count  Frequency (%)
(      1      100.0%

Close Punctuation
Value  Count  Frequency (%)
)      1      100.0%

Most occurring scripts

Value   Count    Frequency (%)
Latin   2082723  98.9%
Common  23717    1.1%

Most frequent character per script

Latin
Value              Count   Frequency (%)
a                  212233  10.2%
e                  203678  9.8%
n                  170211  8.2%
o                  170102  8.2%
r                  141688  6.8%
l                  121861  5.9%
i                  110464  5.3%
s                  95785   4.6%
t                  89867   4.3%
u                  60643   2.9%
Other values (41)  706191  33.9%

Common
Value    Count  Frequency (%)
(space)  20471  86.3%
.        2720   11.5%
-        426    1.8%
'        98     0.4%
(        1      < 0.1%
)        1      < 0.1%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  2106440  100.0%

Most frequent character per block

ASCII
Value              Count   Frequency (%)
a                  212233  10.1%
e                  203678  9.7%
n                  170211  8.1%
o                  170102  8.1%
r                  141688  6.7%
l                  121861  5.8%
i                  110464  5.2%
s                  95785   4.5%
t                  89867   4.3%
u                  60643   2.9%
Other values (47)  729908  34.7%

Shape
Categorical

HIGH CARDINALITY
UNIFORM

Distinct      430309
Distinct (%)  91.5%
Missing       0
Missing (%)   0.0%
Memory size   7.2 MiB

Value  Count
b'\x00\x01\xad\x10\x00\x00\xb0\xd19?\xc5\x8fP\xc0 ~p>u\xf81@\xb0\xd19?\xc5\x8fP\xc0 ~p>u\xf81@|\x01\x00\x00\x00\xb0\xd19?\xc5\x8fP\xc0 ~p>u\xf81@\xfe'  151
b'\x00\x01\xad\x10\x00\x00(\x87\x16\xd9\xce\xeb^\xc0\x98\x97n\x12\x83\x88D@(\x87\x16\xd9\xce\xeb^\xc0\x98\x97n\x12\x83\x88D@|\x01\x00\x00\x00(\x87\x16\xd9\xce\xeb^\xc0\x98\x97n\x12\x83\x88D@\xfe'  91
b'\x00\x01\xad\x10\x00\x000>\xcc^\xb6\x98P\xc0Ps\x9dFZ\xfe1@0>\xcc^\xb6\x98P\xc0Ps\x9dFZ\xfe1@|\x01\x00\x00\x000>\xcc^\xb6\x98P\xc0Ps\x9dFZ\xfe1@\xfe'  90
b'\x00\x01\xad\x10\x00\x00d\xc4\x05\xa0Q\x87P\xc0@\xb4s\x9a\x05\xfa1@d\xc4\x05\xa0Q\x87P\xc0@\xb4s\x9a\x05\xfa1@|\x01\x00\x00\x00d\xc4\x05\xa0Q\x87P\xc0@\xb4s\x9a\x05\xfa1@\xfe'  85
b'\x00\x01\xad\x10\x00\x00$~\x8c\xb9k5]\xc0\xb8Y\xf5\xb9\xdaJ@@$~\x8c\xb9k5]\xc0\xb8Y\xf5\xb9\xdaJ@@|\x01\x00\x00\x00$~\x8c\xb9k5]\xc0\xb8Y\xf5\xb9\xdaJ@@\xfe'  81
Other values (430304)  469618

Length

Max length     216
Median length  207
Mean length    172.42236
Min length     105

Characters and Unicode

Total characters     81058509
Distinct characters  95
Distinct categories  12
Distinct scripts     2
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      408593
Unique (%)  86.9%

Sample

1st row  b'\x00\x01\xad\x10\x00\x00\x00M\x84\rOKW\xc0\x80\xf9\x0f\xe9\xb7??@\x00M\x84\rOKW\xc0\x80\xf9\x0f\xe9\xb7??@|\x01\x00\x00\x00\x00M\x84\rOKW\xc0\x80\xf9\x0f\xe9\xb7??@\xfe'
2nd row  b"\x00\x01\xad\x10\x00\x00\xd4V\xec/\xbb\xdbS\xc0XR'\xa0\x89hA@\xd4V\xec/\xbb\xdbS\xc0XR'\xa0\x89hA@|\x01\x00\x00\x00\xd4V\xec/\xbb\xdbS\xc0XR'\xa0\x89hA@\xfe"
3rd row  b'\x00\x01\xad\x10\x00\x00d\xfc\xfb\x8c\x0b\x0b\\\xc0\x10\x93\xa9\x82QYC@d\xfc\xfb\x8c\x0b\x0b\\\xc0\x10\x93\xa9\x82QYC@|\x01\x00\x00\x00d\xfc\xfb\x8c\x0b\x0b\\\xc0\x10\x93\xa9\x82QYC@\xfe'
4th row  b'\x00\x01\xad\x10\x00\x00\xacG\xe1z\x14\xfe\\\xc0\xd0\xcf\x14t\xda H@\xacG\xe1z\x14\xfe\\\xc0\xd0\xcf\x14t\xda H@|\x01\x00\x00\x00\xacG\xe1z\x14\xfe\\\xc0\xd0\xcf\x14t\xda H@\xfe'
5th row  b'\x00\x01\xad\x10\x00\x00\xc0\x1a%\xb7?H]\xc0x\x87\x02\x8d\x04\x9c@@\xc0\x1a%\xb7?H]\xc0x\x87\x02\x8d\x04\x9c@@|\x01\x00\x00\x00\xc0\x1a%\xb7?H]\xc0x\x87\x02\x8d\x04\x9c@@\xfe'
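
The Shape values are byte strings profiled as text, so the length and character statistics for this variable describe the escaped representation (backslashes, `x`, hex digits), not the geometry itself. The layout of the bytes resembles a SpatiaLite point geometry blob. The sketch below decodes one value under that assumption, using only the standard `struct` module. The field layout, the name `decode_point` and the interpretation of the result are all assumptions, not something the report states.

```python
import struct

# Assumed layout (little-endian point blob): start byte, byte-order flag, SRID,
# bounding box (min x, min y, max x, max y), 0x7C marker, geometry class,
# x, y, and a 0xFE end marker. Total: 60 bytes.
POINT_BLOB = struct.Struct("<BBi4dBi2dB")


def decode_point(blob):
    """Decode a 60-byte point blob into (srid, x, y); raise ValueError otherwise."""
    if len(blob) != POINT_BLOB.size:
        raise ValueError(f"expected {POINT_BLOB.size} bytes, got {len(blob)}")
    start, order, srid, _, _, _, _, mbr_end, geom_class, x, y, end = POINT_BLOB.unpack(blob)
    if (start, order, mbr_end, geom_class, end) != (0x00, 0x01, 0x7C, 1, 0xFE):
        raise ValueError("not a little-endian point blob in the assumed layout")
    return srid, x, y


# The most common Shape value in this report (151 occurrences).
blob = (
    b'\x00\x01\xad\x10\x00\x00\xb0\xd19?\xc5\x8fP\xc0 ~p>u\xf81@'
    b'\xb0\xd19?\xc5\x8fP\xc0 ~p>u\xf81@|\x01\x00\x00\x00'
    b'\xb0\xd19?\xc5\x8fP\xc0 ~p>u\xf81@\xfe'
)
srid, lon, lat = decode_point(blob)
print(srid, lon, lat)
```

If the assumption holds, decoding the column into numeric coordinates before profiling would give more meaningful statistics than treating the escaped bytes as categorical text.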

Common Values

Value  Count  Frequency (%)
b'\x00\x01\xad\x10\x00\x00\xb0\xd19?\xc5\x8fP\xc0 ~p>u\xf81@\xb0\xd19?\xc5\x8fP\xc0 ~p>u\xf81@|\x01\x00\x00\x00\xb0\xd19?\xc5\x8fP\xc0 ~p>u\xf81@\xfe'  151  < 0.1%
b'\x00\x01\xad\x10\x00\x00(\x87\x16\xd9\xce\xeb^\xc0\x98\x97n\x12\x83\x88D@(\x87\x16\xd9\xce\xeb^\xc0\x98\x97n\x12\x83\x88D@|\x01\x00\x00\x00(\x87\x16\xd9\xce\xeb^\xc0\x98\x97n\x12\x83\x88D@\xfe'  91  < 0.1%
b'\x00\x01\xad\x10\x00\x000>\xcc^\xb6\x98P\xc0Ps\x9dFZ\xfe1@0>\xcc^\xb6\x98P\xc0Ps\x9dFZ\xfe1@|\x01\x00\x00\x000>\xcc^\xb6\x98P\xc0Ps\x9dFZ\xfe1@\xfe'  90  < 0.1%
b'\x00\x01\xad\x10\x00\x00d\xc4\x05\xa0Q\x87P\xc0@\xb4s\x9a\x05\xfa1@d\xc4\x05\xa0Q\x87P\xc0@\xb4s\x9a\x05\xfa1@|\x01\x00\x00\x00d\xc4\x05\xa0Q\x87P\xc0@\xb4s\x9a\x05\xfa1@\xfe'  85  < 0.1%
b'\x00\x01\xad\x10\x00\x00$~\x8c\xb9k5]\xc0\xb8Y\xf5\xb9\xdaJ@@$~\x8c\xb9k5]\xc0\xb8Y\xf5\xb9\xdaJ@@|\x01\x00\x00\x00$~\x8c\xb9k5]\xc0\xb8Y\xf5\xb9\xdaJ@@\xfe'  81  < 0.1%
b'\x00\x01\xad\x10\x00\x00\x14\xc3\xd5\x01\x10\x8bP\xc0\x00\xaa\xd5WW\xf51@\x14\xc3\xd5\x01\x10\x8bP\xc0\x00\xaa\xd5WW\xf51@|\x01\x00\x00\x00\x14\xc3\xd5\x01\x10\x8bP\xc0\x00\xaa\xd5WW\xf51@\xfe'  77  < 0.1%
b'\x00\x01\xad\x10\x00\x00L\x15\x8cJ\xea\x9c[\xc0\xb0,C\x1c\xeb\xaa@@L\x15\x8cJ\xea\x9c[\xc0\xb0,C\x1c\xeb\xaa@@|\x01\x00\x00\x00L\x15\x8cJ\xea\x9c[\xc0\xb0,C\x1c\xeb\xaa@@\xfe'  70  < 0.1%
b'\x00\x01\xad\x10\x00\x00\x14\xc7\xdc\x10>CR\xc0\xa0\x12I\xf42\xbeD@\x14\xc7\xdc\x10>CR\xc0\xa0\x12I\xf42\xbeD@|\x01\x00\x00\x00\x14\xc7\xdc\x10>CR\xc0\xa0\x12I\xf42\xbeD@\xfe'  66  < 0.1%
b'\x00\x01\xad\x10\x00\x00P"\x89^F\x90P\xc0 Z\xd6\xfdc\t2@P"\x89^F\x90P\xc0 Z\xd6\xfdc\t2@|\x01\x00\x00\x00P"\x89^F\x90P\xc0 Z\xd6\xfdc\t2@\xfe'  66  < 0.1%
b'\x00\x01\xad\x10\x00\x00\x08uX\xe1\x96\x97P\xc0P\xb3\xeb\xde\x8a\xf81@\x08uX\xe1\x96\x97P\xc0P\xb3\xeb\xde\x8a\xf81@|\x01\x00\x00\x00\x08uX\xe1\x96\x97P\xc0P\xb3\xeb\xde\x8a\xf81@\xfe'  60  < 0.1%
Other values (430299)  469279  99.8%

Length

Histogram of lengths of the category
Value  Count  Frequency (%)
b'\x00\x01\xad\x10\x00\x00  5543  1.0%
xfe  602  0.1%
a@\xfe  399  0.1%
b'\x00\x01\xad\x10\x00\x00x  356  0.1%
xc0  348  0.1%
b'\x00\x01\xad\x10\x00\x004  329  0.1%
b@\xfe  272  < 0.1%
b'\x00\x01\xad\x10\x00\x00\xb0  192  < 0.1%
b"\x00\x01\xad\x10\x00\x00  181  < 0.1%
c@\xfe  179  < 0.1%
Other values (511481)  573390  98.6%

Most occurring characters

Value              Count     Frequency (%)
\                  17564852  21.7%
x                  17142455  21.1%
0                  10778844  13.3%
c                  3401265   4.2%
1                  3056754   3.8%
f                  2303929   2.8%
8                  2296344   2.8%
e                  2164582   2.7%
b                  2118034   2.6%
a                  2053369   2.5%
Other values (85)  18178081  22.4%

Most occurring categories

Value              Count     Frequency (%)
Lowercase Letter   32599417  40.2%
Decimal Number     21657750  26.7%
Other Punctuation  21102454  26.0%
Uppercase Letter   3576624   4.4%
Math Symbol        969544    1.2%
Open Punctuation   319761    0.4%
Modifier Symbol    294672    0.4%
Close Punctuation  242490    0.3%
Space Separator    111834    0.1%
Currency Symbol    67965     0.1%
Other values (2)   115998    0.1%

Most frequent character per category

Lowercase Letter
Value              Count     Frequency (%)
x                  17142455  52.6%
c                  3401265   10.4%
f                  2303929   7.1%
e                  2164582   6.6%
b                  2118034   6.5%
a                  2053369   6.3%
d                  2018152   6.2%
p                  161829    0.5%
n                  126075    0.4%
t                  118827    0.4%
Other values (16)  990900    3.0%

Uppercase Letter
Value              Count    Frequency (%)
T                  320052   8.9%
A                  292749   8.2%
D                  196005   5.5%
W                  195123   5.5%
X                  193629   5.4%
U                  192774   5.4%
B                  191460   5.4%
V                  169398   4.7%
C                  157977   4.4%
G                  153576   4.3%
Other values (16)  1513881  42.3%

Other Punctuation
Value             Count     Frequency (%)
\                 17564852  83.2%
@                 1786350   8.5%
'                 958566    4.5%
?                 153675    0.7%
"                 94115     0.4%
,                 71076     0.3%
:                 64914     0.3%
/                 57414     0.3%
;                 54819     0.3%
&                 52779     0.3%
Other values (5)  243894    1.2%

Decimal Number
Value  Count     Frequency (%)
0      10778844  49.8%
1      3056754   14.1%
8      2296344   10.6%
9      1649055   7.6%
4      827934    3.8%
3      661287    3.1%
7      632007    2.9%
5      611571    2.8%
2      595701    2.8%
6      548253    2.5%

Math Symbol
Value  Count   Frequency (%)
|      535345  55.2%
>      128853  13.3%
=      111423  11.5%
<      86079   8.9%
~      58173   6.0%
+      49671   5.1%

Close Punctuation
Value  Count   Frequency (%)
]      140937  58.1%
}      51258   21.1%
)      50295   20.7%

Open Punctuation
Value  Count   Frequency (%)
(      140838  44.0%
[      131142  41.0%
{      47781   14.9%

Modifier Symbol
Value  Count   Frequency (%)
^      177519  60.2%
`      117153  39.8%

Space Separator
Value    Count   Frequency (%)
(space)  111834  100.0%

Currency Symbol
Value  Count  Frequency (%)
$      67965  100.0%

Connector Punctuation
Value  Count  Frequency (%)
_      67716  100.0%

Dash Punctuation
Value  Count  Frequency (%)
-      48282  100.0%

Most occurring scripts

Value   Count     Frequency (%)
Common  44882468  55.4%
Latin   36176041  44.6%

Most frequent character per script

Latin
Value              Count     Frequency (%)
x                  17142455  47.4%
c                  3401265   9.4%
f                  2303929   6.4%
e                  2164582   6.0%
b                  2118034   5.9%
a                  2053369   5.7%
d                  2018152   5.6%
T                  320052    0.9%
A                  292749    0.8%
D                  196005    0.5%
Other values (42)  4165449   11.5%

Common
Value              Count     Frequency (%)
\                  17564852  39.1%
0                  10778844  24.0%
1                  3056754   6.8%
8                  2296344   5.1%
@                  1786350   4.0%
9                  1649055   3.7%
'                  958566    2.1%
4                  827934    1.8%
3                  661287    1.5%
7                  632007    1.4%
Other values (33)  4670475   10.4%

Most occurring blocks

Value  Count     Frequency (%)
ASCII  81058509  100.0%

Most frequent character per block

ASCII
Value              Count     Frequency (%)
\                  17564852  21.7%
x                  17142455  21.1%
0                  10778844  13.3%
c                  3401265   4.2%
1                  3056754   3.8%
f                  2303929   2.8%
8                  2296344   2.8%
e                  2164582   2.7%
b                  2118034   2.6%
a                  2053369   2.5%
Other values (85)  18178081  22.4%

Interactions

2022-12-07T19:46:01.215084image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-12-07T19:46:06.479621image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-12-07T19:46:11.549363image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-12-07T19:46:16.302220image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-12-07T19:45:02.915600image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-12-07T19:45:08.229841image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-12-07T19:45:13.461654image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-12-07T19:45:18.611420image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-12-07T19:45:23.661428image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-12-07T19:45:29.044217image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-12-07T19:45:33.119163image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-12-07T19:45:39.063530image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-12-07T19:45:43.320196image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-12-07T19:45:47.455025image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-12-07T19:45:51.370847image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-12-07T19:45:56.481846image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-12-07T19:46:01.549199image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-12-07T19:46:06.814156image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-12-07T19:46:11.883354image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-12-07T19:46:16.570148image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-12-07T19:45:03.228899image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-12-07T19:45:08.515774image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-12-07T19:45:13.749730image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-12-07T19:45:18.884590image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-12-07T19:45:23.948869image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-12-07T19:45:29.348896image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-12-07T19:45:33.337154image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-12-07T19:45:39.335695image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-12-07T19:45:43.537697image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-12-07T19:45:47.703766image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-12-07T19:45:51.589986image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-12-07T19:45:56.753611image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-12-07T19:46:01.834609image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-12-07T19:46:07.100792image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2022-12-07T19:46:12.169759image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/

Correlations

Auto

The auto setting picks an interpretable pairwise association metric for each pair of columns, based on the two variable types:
  • Variable_type-Variable_type : Method, Range
  • Categorical-Categorical : Cramer's V, [0,1]
  • Numerical-Categorical : Cramer's V, [0,1] (using a discretized numerical column)
  • Numerical-Numerical : Spearman's ρ, [-1,1]
The number of bins used in the discretization for the Numerical-Categorical column pair can be changed using config.correlations["auto"].n_bins. The number of bins affects the granularity of the association you wish to measure.

This configuration uses the recommended metric for each pair of columns.
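
The discretization step for a Numerical-Categorical pair can be sketched as follows. This is a minimal illustration that assumes equal-width bins; the report does not state which binning rule the library applies, and the helper name discretize is hypothetical.

```python
from collections import Counter


def discretize(values, n_bins):
    """Map each numeric value to an equal-width bin index in [0, n_bins - 1].

    Equal-width binning is an assumption made for this sketch only.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        # A constant column falls into a single bin.
        return [0] * len(values)
    # The maximum value would land in bin n_bins, so clamp it to the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Hypothetical numeric column and categorical column of equal length.
sizes = [0.0, 2.5, 5.0, 7.5, 10.0]
classes = ["A", "A", "B", "B", "B"]

bins = discretize(sizes, n_bins=4)

# Contingency counts of (bin, category); a categorical association
# measure such as Cramér's V is then computed from this table.
contingency = Counter(zip(bins, classes))
```

With more bins the contingency table has more, sparser cells, which is the granularity trade-off the n_bins setting controls.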

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
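
The calculation described above can be sketched in plain Python: rank both variables, then divide the covariance of the ranks by the product of their standard deviations. The function names are illustrative, and giving tied values the average of their ranks is an assumption of this sketch (it is the usual convention).

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Covariance of the rank variables divided by the product of their stds."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry)) / n
    sx = sqrt(sum((a - mx) ** 2 for a in rx) / n)
    sy = sqrt(sum((b - my) ** 2 for b in ry) / n)
    return cov / (sx * sy)


# A nonlinear but strictly increasing relationship: rho is 1 up to rounding.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
```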

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for an exactly linear relationship the steepness of the line does not affect r; only the sign of the slope does.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
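
A minimal sketch of this formula, which also checks the invariance under changes of location and scale mentioned above. The data values are made up for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]

# r is approximately 0.8 for these values.
r = pearson_r(x, y)

# Shifting and rescaling each variable separately (with positive scale
# factors) leaves r unchanged up to floating-point rounding.
r_transformed = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])
```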

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
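
The pair-counting definition above corresponds to the simple form often called tau-a, sketched below. Note that library implementations commonly use a tie-adjusted variant instead (SciPy's kendalltau defaults to tau-b), so results can differ when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs.

    Pairs tied in x or in y count as neither concordant nor discordant.
    """
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1  # both variables move the same way
        elif direction < 0:
            discordant += 1  # the variables move in opposite ways
    total_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / total_pairs


# Six pairs in total: five concordant and one discordant (the middle swap),
# so tau = (5 - 1) / 6.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
```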

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. This report uses the bias-corrected measure proposed by Bergsma in 2013.
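
As a rough illustration, the sketch below computes Cramér's V with the bias correction commonly attributed to Bergsma (2013). It follows the correction as usually stated (subtracting the expected bias from φ² and shrinking the row and column counts); the report does not show its own implementation, so treat the details and the function name as assumptions.

```python
from collections import Counter
from math import sqrt


def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two equal-length categorical sequences."""
    n = len(a)
    joint = Counter(zip(a, b))
    row_totals, col_totals = Counter(a), Counter(b)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic over the full r x k contingency table.
    chi2 = 0.0
    for i, n_i in row_totals.items():
        for j, n_j in col_totals.items():
            expected = n_i * n_j / n
            chi2 += (joint.get((i, j), 0) - expected) ** 2 / expected

    phi2 = chi2 / n
    # Correction terms (assumed form of the Bergsma 2013 adjustment).
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    if denom <= 0:
        # A column with a single category carries no association information.
        return 0.0
    return sqrt(phi2_corr / denom)


# Perfectly associated columns give a value of 1 (up to rounding);
# independent columns give 0.
v_perfect = cramers_v_corrected(["x", "y"] * 10, ["p", "q"] * 10)
v_independent = cramers_v_corrected(["x", "x", "y", "y"] * 5, ["p", "q", "p", "q"] * 5)
```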

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
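
One way to read nullity correlation is as the Pearson correlation between the "is missing" indicators of two columns. The sketch below assumes that definition and uses None to stand for a missing value; the function name and the toy columns are hypothetical.

```python
from math import sqrt


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missingness indicators of two columns.

    Returns None when either column is never or always missing, because the
    indicator then has zero variance and the correlation is undefined.
    """
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    var_a = sum((v - mean_a) ** 2 for v in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    if var_a == 0 or var_b == 0:
        return None
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    return cov / sqrt(var_a * var_b)


# Two columns that are always missing together score close to +1 ...
together = nullity_correlation([None, 1, None, 2], [None, 5, None, 6])
# ... and columns where one is present exactly when the other is missing
# score close to -1.
opposite = nullity_correlation([None, 1, None, 2], [3, None, 4, None])
```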

Sample

Unnamed: 0OBJECTIDFOD_IDFPA_IDSOURCE_SYSTEM_TYPESOURCE_SYSTEMNWCG_REPORTING_AGENCYNWCG_REPORTING_UNIT_IDNWCG_REPORTING_UNIT_NAMESOURCE_REPORTING_UNITSOURCE_REPORTING_UNIT_NAMELOCAL_FIRE_REPORT_IDLOCAL_INCIDENT_IDFIRE_CODEFIRE_NAMEICS_209_INCIDENT_NUMBERICS_209_NAMEMTBS_IDMTBS_FIRE_NAMECOMPLEX_NAMEFIRE_YEARDISCOVERY_DATEDISCOVERY_DOYDISCOVERY_TIMESTAT_CAUSE_CODESTAT_CAUSE_DESCRCONT_DATECONT_DOYCONT_TIMEFIRE_SIZEFIRE_SIZE_CLASSLATITUDELONGITUDEOWNER_CODEOWNER_DESCRSTATECOUNTYFIPS_CODEFIPS_NAMEShape
1381838944218944221020177SWRA_LA_22416NONFEDST-LALASST/C&LUSLALASLouisiana Office of ForestryLALAS7LAS District 7NaNLA7-U2NaNNaNNaNNaNNaNNaNNaN20002451585.542NaN9.0MiscellaneousNaNNaNNaN1.00B31.248900-93.17670014.0MISSING/NOT SPECIFIEDLANaNNaNNaNb'\x00\x01\xad\x10\x00\x00\x00M\x84\rOKW\xc0\x80\xf9\x0f\xe9\xb7??@\x00M\x84\rOKW\xc0\x80\xf9\x0f\xe9\xb7??@|\x01\x00\x00\x00\x00M\x84\rOKW\xc0\x80\xf9\x0f\xe9\xb7??@\xfe'
519285104184210418431171683TFS_NC_220729NONFEDST-NCNCSST/C&LUSNCNCSNorth Carolina Forest ServiceNCNCS203NCS Region 2 District 3NaN01-083NaNMESSY PLACENaNNaNNaNNaNNaN20012451946.537NaN9.0MiscellaneousNaNNaNNaN0.50B34.816700-79.43330014.0MISSING/NOT SPECIFIEDNCNaNNaNNaNb"\x00\x01\xad\x10\x00\x00\xd4V\xec/\xbb\xdbS\xc0XR'\xa0\x89hA@\xd4V\xec/\xbb\xdbS\xc0XR'\xa0\x89hA@|\x01\x00\x00\x00\xd4V\xec/\xbb\xdbS\xc0XR'\xa0\x89hA@\xfe"
116246755583755584856395UT_2828-2000NONFEDST-UTUTSST/C&LUSUTUTSUtah Division Forestry Fire State LandsUTUTSUtah Division Forestry Fire State LandsNaN2828-2000NaNELSINORE MTN (ADAM)NaNNaNNaNNaNNaN20002451745.5202NaN13.0Missing/Undefined2451760.5217.0NaN5.00B38.697800-112.17258014.0MISSING/NOT SPECIFIEDUTSEVIER41.0Sevierb'\x00\x01\xad\x10\x00\x00d\xfc\xfb\x8c\x0b\x0b\\\xc0\x10\x93\xa9\x82QYC@d\xfc\xfb\x8c\x0b\x0b\\\xc0\x10\x93\xa9\x82QYC@|\x01\x00\x00\x00d\xfc\xfb\x8c\x0b\x0b\\\xc0\x10\x93\xa9\x82QYC@\xfe'
179264105381053910587FS-1438071FEDFS-FIRESTATFSUSMTKNFKootenai National Forest0114Kootenai National Forest20347B6TBWHOOPEE CREEKNaNNaNNaNNaNNaN20062453965.52302200.01.0Lightning2453966.5231.02000.00.25A48.256667-115.9700005.0USFSMT5353.0Lincolnb'\x00\x01\xad\x10\x00\x00\xacG\xe1z\x14\xfe\\\xc0\xd0\xcf\x14t\xda H@\xacG\xe1z\x14\xfe\\\xc0\xd0\xcf\x14t\xda H@|\x01\x00\x00\x00\xacG\xe1z\x14\xfe\\\xc0\xd0\xcf\x14t\xda H@\xfe'
517587113169811316991379717CDF_1997_55_2223_882NONFEDST-CACDFST/C&LUSCAMVUMonte Vista UnitCAMVUCDF - Monte Vista UnitNaN882NaNI-15 NR18NaNNaNNaNNaNNaN19972450651.5203NaN2.0Equipment UseNaNNaNNaN1.00B33.218889-117.12888914.0MISSING/NOT SPECIFIEDCANaNNaNNaNb'\x00\x01\xad\x10\x00\x00\xc0\x1a%\xb7?H]\xc0x\x87\x02\x8d\x04\x9c@@\xc0\x1a%\xb7?H]\xc0x\x87\x02\x8d\x04\x9c@@|\x01\x00\x00\x00\xc0\x1a%\xb7?H]\xc0x\x87\x02\x8d\x04\x9c@@\xfe'
3658039337529337531061196SWRA_TN_13162NONFEDST-TNTNSST/C&LUSTNTNSTennessee Division of ForestryTNTNS5TNS Unit 5NaN0229NaNNaNNaNNaNNaNNaNNaN20012451994.585NaN5.0Debris BurningNaNNaNNaN1.00B35.001700-86.70330014.0MISSING/NOT SPECIFIEDTNNaNNaNNaNb'\x00\x01\xad\x10\x00\x00\xb4\xd1\x00\xde\x02\xadU\xc0\xa03\xa2\xb47\x80A@\xb4\xd1\x00\xde\x02\xadU\xc0\xa03\xa2\xb47\x80A@|\x01\x00\x00\x00\xb4\xd1\x00\xde\x02\xadU\xc0\xa03\xa2\xb47\x80A@\xfe'
2686791808791809897806SCHMIDT_55017528NONFEDST-WIWISST/C&LUSWIWISWisconsin Department of Natural ResourcesWIWISWisconsin Department of Natural ResourcesNaNNaNNaNNaNNaNNaNNaNNaNNaN19962450242.51601426.09.0Miscellaneous2450242.5160.01428.00.01A45.859385-91.84440614.0MISSING/NOT SPECIFIEDWIWashburn129.0Washburnb'\x00\x01\xad\x10\x00\x00 Z\xe9\xbd\n\xf6V\xc0\xe0c\nR\x00\xeeF@ Z\xe9\xbd\n\xf6V\xc0\xe0c\nR\x00\xeeF@|\x01\x00\x00\x00 Z\xe9\xbd\n\xf6V\xc0\xe0c\nR\x00\xeeF@\xfe'
60950112375211237531369660CDF_1999_54_2220_14012NONFEDST-CACDFST/C&LUSCATGUTehama-Glenn UnitCATGUTehama-Glenn UnitNaN14012NaNFALSE ALARMNaNNaNNaNNaNNaN19992451213.535NaN5.0Debris BurningNaNNaNNaN1.00B40.208889-122.26388914.0MISSING/NOT SPECIFIEDCANaNNaNNaNb'\x00\x01\xad\x10\x00\x000X/\x8e\xe3\x90^\xc0\x98\x0c\xee\xde\xbc\x1aD@0X/\x8e\xe3\x90^\xc0\x98\x0c\xee\xde\xbc\x1aD@|\x01\x00\x00\x000X/\x8e\xe3\x90^\xc0\x98\x0c\xee\xde\xbc\x1aD@\xfe'
495969219916219917223301W-34083FEDDOI-WFMIBLMUSCOCRDNorthwest DistrictCOCRDWestern Slope Center, CraigNaNNaNE156FOUR MILENaNNaNNaNNaNNaN19932449211.52241707.01.0Lightning2449211.5224.01800.00.30B39.950000-108.7173001.0BLMCONaNNaNNaNb'\x00\x01\xad\x10\x00\x00\xecZB>\xe8-[\xc0\xa0\x99\x99\x99\x99\xf9C@\xecZB>\xe8-[\xc0\xa0\x99\x99\x99\x99\xf9C@|\x01\x00\x00\x00\xecZB>\xe8-[\xc0\xa0\x99\x99\x99\x99\xf9C@\xfe'
537270142431142432143823FS-370292FEDFS-FIRESTATFSUSMIHMFHuron-Manistee National Forest0904Huron-Manistee National Forest46NaNNaNWILBURNaNNaNNaNNaNNaN20012452129.52201431.01.0Lightning2452129.5220.01900.01.00B44.000000-83.00000013.0STATE OR PRIVATEMI6969.0Ioscob'\x00\x01\xad\x10\x00\x00\xfc\xff\xff\xff\xff\xbfT\xc0\x08\x00\x00\x00\x00\x00F@\xfc\xff\xff\xff\xff\xbfT\xc0\x08\x00\x00\x00\x00\x00F@|\x01\x00\x00\x00\xfc\xff\xff\xff\xff\xbfT\xc0\x08\x00\x00\x00\x00\x00F@\xfe'
Unnamed: 0OBJECTIDFOD_IDFPA_IDSOURCE_SYSTEM_TYPESOURCE_SYSTEMNWCG_REPORTING_AGENCYNWCG_REPORTING_UNIT_IDNWCG_REPORTING_UNIT_NAMESOURCE_REPORTING_UNITSOURCE_REPORTING_UNIT_NAMELOCAL_FIRE_REPORT_IDLOCAL_INCIDENT_IDFIRE_CODEFIRE_NAMEICS_209_INCIDENT_NUMBERICS_209_NAMEMTBS_IDMTBS_FIRE_NAMECOMPLEX_NAMEFIRE_YEARDISCOVERY_DATEDISCOVERY_DOYDISCOVERY_TIMESTAT_CAUSE_CODESTAT_CAUSE_DESCRCONT_DATECONT_DOYCONT_TIMEFIRE_SIZEFIRE_SIZE_CLASSLATITUDELONGITUDEOWNER_CODEOWNER_DESCRSTATECOUNTYFIPS_CODEFIPS_NAMEShape
413825112537811253791371501CDF_2007_54_2220_004583NONFEDST-CACDFST/C&LUSCATGUTehama-Glenn UnitCATGUTehama-Glenn UnitNaN004583NaNDORA IC/TC FIRENaNNaNNaNNaNNaN20072454282.5182NaN2.0Equipment UseNaNNaNNaN4.00B39.891944-122.16888914.0MISSING/NOT SPECIFIEDCANaNNaNNaNb'\x00\x01\xad\x10\x00\x00\x80\x10N\x13\xcf\x8a^\xc0\x18\xd3C<+\xf2C@\x80\x10N\x13\xcf\x8a^\xc0\x18\xd3C<+\xf2C@|\x01\x00\x00\x00\x80\x10N\x13\xcf\x8a^\xc0\x18\xd3C<+\xf2C@\xfe'
2295201458784145878520025051FS-1499294FEDFS-FIRESTATFSUSFLFNFNational Forests in Florida0805National Forests In Florida15923EK2YAIR STRIPNaNNaNNaNNaNNaN20112455739.51781200.01.0Lightning2455739.5178.01400.00.57B30.246389-84.9983335.0USFSFLNaNNaNNaNb'\x00\x01\xad\x10\x00\x00\xd4\xb6z\xb1\xe4?U\xc0\xe0\xa5\xa0W\x13?>@\xd4\xb6z\xb1\xe4?U\xc0\xe0\xa5\xa0W\x13?>@|\x01\x00\x00\x00\xd4\xb6z\xb1\xe4?U\xc0\xe0\xa5\xa0W\x13?>@\xfe'
21440135104413510451830880SFO-NY-NYS-1998-240NONFEDST-NASFST/C&LUSNYNYSNew York Forest RangersNYNYSNew York Forest RangersNaNNYS-1998-240NaNNaNNaNNaNNaNNaNNaN19982451113.5300NaN7.0ArsonNaNNaNNaN5.00B42.304876-78.25060114.0MISSING/NOT SPECIFIEDNYALLEGANY3.0Alleganyb"\x00\x01\xad\x10\x00\x00\x10\xd6\xc6\xd8\t\x90S\xc0\xf0\xaa@-\x06'E@\x10\xd6\xc6\xd8\t\x90S\xc0\xf0\xaa@-\x06'E@|\x01\x00\x00\x00\x10\xd6\xc6\xd8\t\x90S\xc0\xf0\xaa@-\x06'E@\xfe"
117583479724479725516391SFO-LA0408-86331NONFEDST-NASFST/C&LUSLALASLouisiana Office of ForestryLALA7LAS District 7NaNNaNNaNNaNNaNNaNNaNNaNNaN20082454585.5120NaN7.0ArsonNaNNaNNaN1.50B30.773010-93.54340014.0MISSING/NOT SPECIFIEDLABeauregard11.0Beauregardb'\x00\x01\xad\x10\x00\x00\\)\xcb\x10\xc7bW\xc00{\xbd\xfb\xe3\xc5>@\\)\xcb\x10\xc7bW\xc00{\xbd\xfb\xe3\xc5>@|\x01\x00\x00\x00\\)\xcb\x10\xc7bW\xc00{\xbd\xfb\xe3\xc5>@\xfe'
73349836607836608957898SWRA_AL_31034NONFEDST-ALALSST/C&LUSALALSAlabama Forestry CommissionALALS1AFC District 1NaN85.499NaNNaNNaNNaNNaNNaNNaN19972450496.548NaN7.0ArsonNaNNaNNaN7.00B34.021900-85.50000014.0MISSING/NOT SPECIFIEDALNaNNaNNaNb'\x00\x01\xad\x10\x00\x00\xfc\xff\xff\xff\xff_U\xc00\xe4\x83\x9e\xcd\x02A@\xfc\xff\xff\xff\xff_U\xc00\xe4\x83\x9e\xcd\x02A@|\x01\x00\x00\x00\xfc\xff\xff\xff\xff_U\xc00\xe4\x83\x9e\xcd\x02A@\xfe'
371403113333811333391381940CDF_2002_55_2225_069945NONFEDST-CACDFST/C&LUSCARRURiverside UnitCARRUCDF - Riverside UnitNaN069945NaNPALMNaNNaNNaNNaNNaN20022452539.5265NaN9.0MiscellaneousNaNNaNNaN0.10A33.881111-117.21805614.0MISSING/NOT SPECIFIEDCANaNNaNNaNb'\x00\x01\xad\x10\x00\x00\x84\xfdC\x9f\xf4M]\xc0\xd8\xcb\xb4?\xc8\xf0@@\x84\xfdC\x9f\xf4M]\xc0\xd8\xcb\xb4?\xc8\xf0@@|\x01\x00\x00\x00\x84\xfdC\x9f\xf4M]\xc0\xd8\xcb\xb4?\xc8\xf0@@\xfe'
491263173801738117466FS-1446808FEDFS-FIRESTATFSUSAZTNFTonto National Forest0312Tonto National Forest25153B6T5HINTONNaNNaNNaNNaNNaN20062453940.52051820.01.0Lightning2453940.5205.02330.00.10A33.851389-110.8825005.0USFSAZ77.0Gilab'\x00\x01\xad\x10\x00\x00\x14\xaeG\xe1z\xb8[\xc00]\xa7O\xfa\xec@@\x14\xaeG\xe1z\xb8[\xc00]\xa7O\xfa\xec@@|\x01\x00\x00\x00\x14\xaeG\xe1z\xb8[\xc00]\xa7O\xfa\xec@@\xfe'
4709241385154138515519080924SFO-GA-DOU-27-3/19/1994-1330NONFEDST-GAGASST/C&LUSGAGASGeorgia Forestry CommissionGAGASGeorgia Forestry CommissionNaN27NaNNaNNaNNaNNaNNaNNaN19942449430.5781330.05.0Debris Burning2449430.578.01647.02.97B33.641500-84.7944008.0PRIVATEGADouglas97.0Douglasb'\x00\x01\xad\x10\x00\x00L\xfc\x18s\xd72U\xc001\x08\xac\x1c\xd2@@L\xfc\x18s\xd72U\xc001\x08\xac\x1c\xd2@@|\x01\x00\x00\x00L\xfc\x18s\xd72U\xc001\x08\xac\x1c\xd2@@\xfe'
4917559196659196661047008SWRA_SC_60640NONFEDST-SCSCSST/C&LUSSCSCSSouth Carolina Forestry CommissionSCSCS6SCS Unit 6NaNE562NaNNaNNaNNaNNaNNaNNaN20002451717.5174NaN1.0LightningNaNNaNNaN0.10A33.350000-80.78330014.0MISSING/NOT SPECIFIEDSCNaNNaNNaNb'\x00\x01\xad\x10\x00\x008\xbdR\x96!2T\xc0\xd8\xcc\xcc\xcc\xcc\xac@@8\xbdR\x96!2T\xc0\xd8\xcc\xcc\xcc\xcc\xac@@|\x01\x00\x00\x008\xbdR\x96!2T\xc0\xd8\xcc\xcc\xcc\xcc\xac@@\xfe'
128037329755329756337192W-511828FEDDOI-WFMIBLMUSIDIFDIdaho Falls DistrictIDSADSalmon Field OfficeNaNNaNB4RDDERIARNaNNaNNaNNaNNaN20052453604.5234133.04.0Campfire2453604.5234.0250.00.10A45.237700-113.9192001.0BLMIDLemhi59.0Lemhib'\x00\x01\xad\x10\x00\x00\xec\x9e<,\xd4z\\\xc08!\x1f\xf4l\x9eF@\xec\x9e<,\xd4z\\\xc08!\x1f\xf4l\x9eF@|\x01\x00\x00\x00\xec\x9e<,\xd4z\\\xc08!\x1f\xf4l\x9eF@\xfe'